A detailed guide to web scraping using ChatGPT Code Interpreter and its plugins.
If you're not building something entirely new, chances are you need some existing information to begin with.
Or, you might want to look into the competition for valuable input.
In addition, there can be countless reasons for someone to be interested in a specific website's content.
Web scraping is the process that serves such use cases.
And there are a few ways to go about that.
Alternatively, you may require a specific setup for on-premise processing.
Overview of ChatGPT for Web Scraping
I probably don't need to introduce ChatGPT to you.
In short, ChatGPT is a generative AI that responds much like a human would.
ChatGPT replies in text.
However, there are ChatGPT plugins that enhance its capabilities in many ways.
And we'll be using one such plugin.
Note that ChatGPT has free and paid versions.
In further sections, I'll illustrate the process step-by-step.
Disclaimer: Before proceeding yourself, please confirm that the subject website allows scraping of its content.
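One quick, partial signal here is the site's robots.txt file. Below is a minimal sketch using Python's standard urllib.robotparser. The robots.txt content and the example.com URLs are made up for illustration, and a passing check does not replace reading the website's terms.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real check would use the subject site's own file.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /checkout/
Allow: /
"""


def is_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Both URLs are placeholders, not pages from this guide.
deals_ok = is_allowed(SAMPLE_ROBOTS_TXT, "https://example.com/deals")
checkout_ok = is_allowed(SAMPLE_ROBOTS_TXT, "https://example.com/checkout/cart")
print(deals_ok, checkout_ok)
```

For a live website, RobotFileParser can also load the file itself through set_url() followed by read().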
Next, click No plugins enabled, scroll down, and click Plugin Store.
Note that instead of No plugins enabled, you'll see a plugin icon if one is already active.
This will launch the Plugin Store.
Select this plugin in the ChatGPT interface.
Once it is selected, prompt ChatGPT, mentioning the subject URL and the content to scrape.
I have done this for a few websites.
How about fetching every deal in a tabular format?
This is what I got:
The problem is that this isn't an isolated case.
You'll find many such instances where websites have anti-scraping measures.
The following sections describe one solution to this.
I have taken this page for extraction:
We will begin by saving the webpage as HTML.
For that, go to the webpage and press Ctrl+S.
Now we have the file for scraping.
Let's figure out the prompt.
And getting these elements is fairly easy.
Right-click anywhere on the subject webpage and click Inspect in the context menu.
First, pick the topmost icon (marked as 1).
This will highlight the details while you select elements from the page.
Next, select the container element for any specific product.
Please make sure to choose the innermost container.
You can hover along, and it will keep highlighting the matching elements.
Similarly, pick sample elements for the other details.
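To see why the container and the sample elements matter, here is a rough sketch of the kind of parsing a script can do once it knows them. It uses only Python's standard html.parser. The markup and the class names (product-card, product-title, product-price) are hypothetical stand-ins, not taken from the page in this guide, and the code ChatGPT writes for you may look quite different.

```python
from html.parser import HTMLParser

# Hypothetical markup standing in for the saved page.
SAMPLE_HTML = """
<div class="product-card">
  <h2 class="product-title">Wireless Mouse</h2>
  <span class="product-price">$19.99</span>
</div>
<div class="product-card">
  <h2 class="product-title">USB-C Hub</h2>
  <span class="product-price">$34.50</span>
</div>
"""

# Tags that never get a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ProductParser(HTMLParser):
    """Collect one dict per container element, keyed by the mapped field names."""

    def __init__(self, container, fields):
        super().__init__()
        self.container = container  # class of the innermost product container
        self.fields = fields        # {class name: output column}
        self.products = []
        self._depth = 0             # nesting depth inside the current container
        self._current = None        # product being built, or None outside one
        self._field = None          # column currently capturing text

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self._current is None:
            if self.container in classes:
                self._current = {}
                self._depth = 1
            return
        self._depth += 1
        for cls in classes:
            if cls in self.fields:
                self._field = self.fields[cls]

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self._current is None:
            return
        self._depth -= 1
        self._field = None
        if self._depth == 0:  # the container itself just closed
            self.products.append(self._current)
            self._current = None

    def handle_data(self, data):
        text = data.strip()
        if self._current is not None and self._field and text:
            self._current[self._field] = self._current.get(self._field, "") + text


parser = ProductParser("product-card",
                       {"product-title": "title", "product-price": "price"})
parser.feed(SAMPLE_HTML)
products = parser.products
print(products)
```

With a saved page, you would feed the parser the file's text instead of SAMPLE_HTML. Picking the innermost container keeps each product in its own record, because everything between that element's opening and closing tags is treated as one item.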
You will see a few details in the response, whereas the complete data will be in the embedded CSV file.
In such cases, you should double-check and clean the data for any redundancies.
If there are any, you can re-prompt ChatGPT to get a clean CSV.
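If you'd rather check for redundancies yourself, a small script can do it. This is a minimal sketch using Python's standard csv module. The column names and rows are invented for illustration, and it assumes a redundancy means a row whose values all repeat an earlier row once surrounding spaces are trimmed.

```python
import csv
import io

# Hypothetical CSV text standing in for the file ChatGPT returns.
SAMPLE_CSV = """title,price
Wireless Mouse,$19.99
USB-C Hub,$34.50
Wireless Mouse,$19.99
 USB-C Hub ,$34.50
"""


def dedupe_rows(csv_text):
    """Return the rows in order, trimmed, with exact repeats dropped."""
    reader = csv.DictReader(io.StringIO(csv_text))
    seen = set()
    unique = []
    for row in reader:
        cleaned = {key: (value or "").strip() for key, value in row.items()}
        fingerprint = tuple(cleaned.values())
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(cleaned)
    return unique


rows = dedupe_rows(SAMPLE_CSV)
print(len(rows))
```

Comparing the number of rows before and after tells you at a glance whether the file had repeats worth re-prompting about.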
Final Thoughts
ChatGPT does many things, and basic web scraping is one of them.
Agreed, it might not be suitable for someone scraping hundreds of pages.
Still, it'll get you started in the right direction and is ideal for a short scraping session.
In this guide, we have used one of its scraping plugins and Code Interpreter.
And to reiterate, go through the subject website's terms before scraping.
PS: Check out these cloud scraping solutions and our own Geekflare scraping API.