Key Features
Smart Crawling
Automatically crawls your website while respecting robots.txt and sitemap.xml. Configure crawl depth and rate limits to suit your needs.
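As an illustration, here is a minimal Python sketch of a polite, depth-limited crawl that checks robots.txt before each fetch. The URL, depth, and delay values are hypothetical placeholders, not the generator's actual internals:

```python
import time
import urllib.robotparser
from urllib.parse import urljoin

# Hypothetical settings -- the generator exposes these as options.
START_URL = "https://example.com/"
MAX_DEPTH = 2        # how many links deep to follow
DELAY_SECONDS = 1.0  # pause between requests (rate limit)

# Load the site's robots.txt once, up front.
robots = urllib.robotparser.RobotFileParser()
robots.set_url(urljoin(START_URL, "/robots.txt"))
robots.read()

visited: set[str] = set()

def crawl(url: str, depth: int = 0) -> None:
    # Skip anything too deep, already seen, or disallowed by robots.txt.
    if depth > MAX_DEPTH or url in visited or not robots.can_fetch("*", url):
        return
    visited.add(url)
    time.sleep(DELAY_SECONDS)
    print(f"fetching {url} (depth {depth})")
    # ...fetch the page, extract its links, then recurse:
    # for link in links_on(url): crawl(link, depth + 1)

crawl(START_URL)
```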
Path Control
Specify allowed and disallowed paths to control exactly which parts of your website should be included in the llms.txt file.
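For example, include/exclude rules like these can be expressed as glob patterns over URL paths. A sketch with hypothetical rule lists, where disallow rules take precedence:

```python
from fnmatch import fnmatch
from urllib.parse import urlparse

# Hypothetical rules -- in practice these come from your path settings.
ALLOWED = ["/docs/*", "/blog/*"]
DISALLOWED = ["/docs/internal/*", "/blog/drafts/*"]

def is_included(url: str) -> bool:
    """Return True if the URL's path passes the allow/deny rules."""
    path = urlparse(url).path
    if any(fnmatch(path, pattern) for pattern in DISALLOWED):
        return False  # explicit exclusions always win
    return any(fnmatch(path, pattern) for pattern in ALLOWED)

print(is_included("https://example.com/docs/setup"))          # True
print(is_included("https://example.com/docs/internal/keys"))  # False
```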
Real-time Progress
Watch the crawling process in real time with detailed progress updates and statistics about your website's content.
Advanced Capabilities
Content Optimization
- Automatic metadata extraction (sketched below)
- Structured content organization
- Intelligent content prioritization
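For instance, metadata extraction typically means pulling each page's title and meta description. A self-contained sketch using Python's standard-library HTML parser (the generator's own pipeline may differ):

```python
from html.parser import HTMLParser

class MetadataExtractor(HTMLParser):
    """Collect the <title> text and meta description from a page."""

    def __init__(self) -> None:
        super().__init__()
        self.title = ""
        self.description = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name") == "description":
            self.description = attrs.get("content", "")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data

extractor = MetadataExtractor()
extractor.feed('<head><title>Docs</title>'
               '<meta name="description" content="Setup guide."></head>')
print(extractor.title, "-", extractor.description)  # Docs - Setup guide.
```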
Performance & Safety
- Rate limiting protection (sketched below)
- Automatic error recovery
- Resource-friendly crawling
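In practice, rate limiting and error recovery often come down to pacing requests and retrying failures with backoff. A minimal sketch with hypothetical numbers:

```python
import time
import urllib.request
from urllib.error import URLError

# Hypothetical defaults -- tune to match your crawl settings.
MIN_INTERVAL = 1.0  # minimum seconds between requests
MAX_RETRIES = 3     # attempts before skipping a page

_last_request = 0.0

def polite_fetch(url: str) -> bytes | None:
    """Fetch a URL no faster than MIN_INTERVAL, retrying with backoff."""
    global _last_request
    for attempt in range(MAX_RETRIES):
        wait = MIN_INTERVAL - (time.monotonic() - _last_request)
        if wait > 0:
            time.sleep(wait)  # rate limiting protection
        _last_request = time.monotonic()
        try:
            with urllib.request.urlopen(url, timeout=10) as response:
                return response.read()
        except URLError:
            time.sleep(2 ** attempt)  # back off, then retry
    return None  # error recovery: give up on this page, keep crawling
```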
Quick Start Guide
- Enter your website URL in the input field above
- Configure crawl settings (depth, rate limit) according to your needs
- Add any specific paths to include or exclude
- Click "Generate" and watch the progress in real time
- Download your generated llms.txt file and place it in your website's root directory
Pro Tip: Start with a low crawl depth and increase it if needed. This helps you understand your website's structure and optimize the generated file.
What is llms.txt?
"Large language models increasingly rely on website information, but face a critical limitation: context windows are too small to handle most websites in their entirety."
Benefits
- Control AI content access
- Optimize for LLMs
- Structured information
- Clear usage guidelines
Technical Details
- Markdown format (example below)
- Root directory placement
- UTF-8 encoding
- Standard compliance
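Putting that together, a generated llms.txt is a UTF-8 Markdown file served from your site root (e.g. https://example.com/llms.txt). A hypothetical example following the structure of the llms.txt proposal (an H1 title, a blockquote summary, then sections of links):

```markdown
# Example Project

> Documentation and blog for Example Project, a hypothetical site
> used here only for illustration.

## Docs

- [Getting started](https://example.com/docs/setup): Install and first run
- [Configuration](https://example.com/docs/config): All available options

## Blog

- [Launch announcement](https://example.com/blog/launch): Why we built it

## Optional

- [Changelog](https://example.com/changelog): Full release history
```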