Tired of 403s and blank pages when scraping JavaScript-heavy websites?
You're not alone, and that's exactly why I built ScrapeSome.
What Is ScrapeSome?
ScrapeSome is a developer-friendly Python library that makes scraping modern websites simple, even the ones loaded with dynamic JavaScript or tough anti-bot protections.
It combines:
- Sync and async support
- Automatic Playwright fallback for headless browser rendering
- CLI support: scrape straight from your terminal
- Built-in error handling, timeouts, and retries
- Output formats: HTML, Markdown, text, or JSON
It's fast, lightweight, and requires zero boilerplate.
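To make the "automatic fallback" idea concrete, here's a rough sketch of the general pattern: try a lightweight fetch first, retry on network errors, and hand off to a browser-based fetch when the response looks blocked or blank. This is not ScrapeSome's internal code. The names `fetch_with_fallback`, `needs_browser`, `plain_fetch`, and `browser_fetch` are all made up for illustration, and the "403 or empty body" check is just an assumed heuristic.

```python
import time
from typing import Callable, Tuple

# Hypothetical type: a fetcher takes a URL and returns (status_code, body).
Fetcher = Callable[[str], Tuple[int, str]]


def needs_browser(status: int, body: str) -> bool:
    """Assumed heuristic: a 403 or an empty body means the plain request was not enough."""
    return status == 403 or not body.strip()


def fetch_with_fallback(
    url: str,
    plain_fetch: Fetcher,
    browser_fetch: Fetcher,
    retries: int = 2,
    delay: float = 0.0,
) -> str:
    """Try the lightweight fetcher first; fall back to the browser fetcher if it fails."""
    for _ in range(retries + 1):
        try:
            status, body = plain_fetch(url)
        except OSError:
            # Network-level failure: pause, then try the plain request again.
            time.sleep(delay)
            continue
        if not needs_browser(status, body):
            return body
        # Blocked or blank: repeating the same request is unlikely to help.
        break
    # Either every attempt raised or the response looked blocked: use the browser.
    _, body = browser_fetch(url)
    return body
```

Passing the two fetchers in as plain callables keeps the sketch testable with stubs; in a real setup one might wrap `requests.get` and the other a Playwright page load.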
Why I Built It
I kept hitting walls on scraping projects:
- Pages rendered everything with JavaScript
- APIs were locked down or undocumented
- requests and Scrapy failed or came back with 403 errors
- Setting up full browser automation felt too heavy for small jobs
So I built ScrapeSome to fill the gap between requests and full-on headless scraping frameworks.
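Quick Example

    from scrapesome import sync_scraper

    # Simplest case: fetch a page with the default settings
    html = sync_scraper("https://example.com")

    # Render with Playwright, ask for Markdown output, and pass a custom user-agent list
    markdown = sync_scraper(
        "https://example.com",
        force_playwright=True,
        output_format="markdown",
        user_agents=["Mozilla/5.0"],
    )
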
CLI usage

    scrapesome scrape --url https://example.com --output-format json

You can even configure behavior with environment variables, which is great for scripting.
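If you'd rather drive the CLI from a Python script, something like the sketch below should work. It only uses the `--url` and `--output-format` flags shown above. The helper names `build_scrape_command` and `run_scrape` are mine, and I'm assuming the command prints its result to stdout and that `scrapesome` is installed and on your PATH.

```python
import subprocess
from typing import List


def build_scrape_command(url: str, output_format: str = "json") -> List[str]:
    """Build the argument list for the `scrapesome scrape` command (hypothetical helper)."""
    return ["scrapesome", "scrape", "--url", url, "--output-format", output_format]


def run_scrape(url: str, output_format: str = "json") -> str:
    """Run the CLI and return whatever it writes to stdout (assumed to be the result)."""
    result = subprocess.run(
        build_scrape_command(url, output_format),
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode bytes to str
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Because `subprocess.run` inherits the parent process environment by default, any environment variables you've exported for configuration are passed along to the command.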
Install It

    pip install scrapesome

Try it out on PyPI:
https://pypi.org/project/scrapesome/
Links
- GitHub: github.com/scrapesome/scrapesome
- Docs: scrapesome.onrender.com
- Full blog post: Medium
Feedback Welcome
This is an early release, and I'd love to hear your thoughts.
Try it, break it, file issues, suggest features, or just star the repo if you like the idea!
Happy scraping!
Vishnu Vardhan Reddy