PubCrawl with Large Language Momentum
How do you quickly inspect a JavaScript-heavy website to resolve a security-related question?
I tend to break open Burp Suite, or dive into my browser's DevTools.
Sometimes, I'll use one of those shady-looking websites that catalog or query the specific thing I want to discover.
For something more automated, I'll write a 10-line script to scrape a specific website using headless browser automation.
These tactics all do the job but bring their own friction.
What if I told you that with Large Language Momentum, you could quickly create a more capable, flexible tool?
That's exactly what I experienced recently when developing PubCrawl.
Instead of cobbling together another one-off script, I used AI assistance (Claude and Aider) to build a one-shot web scraping tool in about an hour.
The result?
A focused, simple scraper that adheres to the Unix philosophy of doing one thing well and playing nicely with other tools.
PubCrawl shines where curl falls short: on JavaScript-heavy websites. No more wrestling with DevTools or setting up inspection proxies. It uses Playwright to fully render pages, outputting everything in clean JSON that's ready for piping into other tools such as jq.
Key features:
- Handles JavaScript-rendered content effortlessly
- Fine-grained control over the response URLs to capture and content types to scrape
- JSON output for easy integration with other tools
- Designed for simplicity and reusability
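To make the idea concrete, here is a minimal sketch of the general technique: render a page with Playwright, keep only the responses that match a URL pattern and a set of content types, and emit JSON. This is not PubCrawl's actual code. The function names (`should_capture`, `crawl`, `to_json`), the parameters, and the output fields are all hypothetical, chosen for illustration, and it assumes Playwright for Python and a Chromium build are installed.

```python
import json
import re


def should_capture(url, content_type, url_pattern, allowed_types):
    """Return True if a response matches the URL regex and an allowed MIME type."""
    if not re.search(url_pattern, url):
        return False
    # Content-Type often carries parameters, e.g. "application/json; charset=utf-8"
    mime = content_type.split(";", 1)[0].strip().lower()
    return mime in allowed_types


def to_json(records):
    """Serialize captured records so the output can be piped into jq."""
    return json.dumps(records, indent=2)


def crawl(target_url, url_pattern=r".*",
          allowed_types=("text/html", "application/json")):
    """Render target_url in headless Chromium and collect matching responses.

    Hypothetical helper for illustration; not PubCrawl's interface.
    """
    # Imported lazily so the filter above works without Playwright installed.
    from playwright.sync_api import sync_playwright

    matched = []

    def on_response(response):
        content_type = response.headers.get("content-type", "")
        if should_capture(response.url, content_type, url_pattern, allowed_types):
            matched.append((response, content_type))

    records = []
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.on("response", on_response)
        # Wait for network activity to settle so script-driven requests are seen.
        page.goto(target_url, wait_until="networkidle")
        for response, content_type in matched:
            try:
                body = response.text()
            except Exception:
                body = None  # some responses (e.g. redirects) expose no body
            records.append({
                "url": response.url,
                "status": response.status,
                "content_type": content_type,
                "body": body,
            })
        browser.close()
    return records
```

Calling `print(to_json(crawl("https://example.com", url_pattern=r"/api/")))` from a small script would print the captured responses as a JSON array, which can then be filtered with something like `jq '.[].url'`. Keeping the filter as a plain function separate from the browser code is a deliberate choice in this sketch: it can be tested without launching a browser.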
Instead of accumulating a drawer full of single-use scripts, you can quickly develop more robust tools that adapt to various scenarios. It's particularly useful for cybersecurity tasks like reconnaissance, compliance checks, or threat intelligence gathering.
The real game-changer is how leading-edge generative LLMs lower the bar for creating reusable tools. Even as a casual Python programmer, I could build something far more capable than my usual quick scripts.
How might Large Language Momentum change your appetite for toolkit development?
Can you think of any one-off scripts you've written that could evolve into more versatile tools with AI assistance?