
Build AI Agents With Web Scraping Using Pica & demlon
In this guide, you will see the following:
What Pica is and why it is an excellent choice for building AI agents that integrate with external tools.
Why AI agents require integration with third-party solutions for data retrieval.
How to use the built-in demlon connector in a Pica agent to fetch web data for more accurate responses.
Let’s dive in!
What Is Pica?
Pica is an open-source platform designed to quickly build AI agents and SaaS integrations. It provides simplified access to 125+ third-party APIs without requiring the management of keys or complex configurations.
The goal of Pica is to make it effortless for AI models to connect with external tools and services. With Pica, you can set up integrations in just a few clicks and then easily use them in your code. This enables AI workflows to handle real-time data retrieval, deal with complex automation, and more.
The project has rapidly gained popularity on GitHub, amassing over 1,300 stars in just a few months. That demonstrates its strong community growth and adoption.
Why AI Agents Need Web Data Integrations
Every AI agent framework inherits core limitations from the LLMs on which it is built. Since LLMs are pre-trained on static datasets, they lack real-time awareness and cannot reliably access live web content.
This often results in outdated answers or even hallucinations. To overcome these limitations, agents (and the LLMs they depend on) need access to trusted, up-to-date web data. Why web data specifically? Because the web remains the most comprehensive and current source of information available.
That is why an effective AI agent must be able to quickly and easily integrate with third-party AI web data providers. And this is exactly where Pica comes into play!
On the Pica platform, you will find over 125 available integrations, including one for demlon:

The demlon integration on Pica
The demlon integration empowers your AI agents and workflows to seamlessly connect to:
Web Unlocker API: An advanced scraping API that bypasses bot protections, delivering any web page’s content in Markdown format.
Web Scraper APIs: Specialized solutions for ethically extracting fresh, structured data from popular sites like Amazon, LinkedIn, Instagram, and 40 others.
These tools give your AI agents, workflows, or pipelines the ability to back their responses with reliable web data, extracted on the fly from relevant pages. See this integration in action in the next chapter!
How to Build an AI Agent That Can Retrieve Data From the Web with Pica and demlon
In this guided section, you will learn how to use Pica to build a Python AI agent that connects to the demlon integration. This way, your agent will be able to retrieve structured web data from sites like Amazon.
Follow the steps below to create your demlon–powered AI agent with Pica!
Prerequisites
To follow this tutorial, you need:
Python 3.9 or higher installed on your machine (we recommend the latest version).
A Pica account.
A demlon API key.
An OpenAI API key.
Do not worry if you do not have a demlon API key or a Pica account yet. We will show you how to set them up in the next steps.
Step #1: Initialize Your Python Project
Open a terminal and create a new directory for your Pica AI agent project:
mkdir pica-demlon-agent
The pica-demlon-agent folder will contain the Python code for your Pica agent, which will use the demlon integration for web data retrieval.
Next, navigate into the project directory and create a virtual environment inside it:
cd pica-demlon-agent
python -m venv venv
Now, open the project in your favorite Python IDE. We recommend Visual Studio Code with the Python extension or PyCharm Community Edition.
Inside the project folder, create a new file named agent.py. Your directory structure should look like this:
pica-demlon-agent/
├── venv/
└── agent.py
Activate the virtual environment in your terminal. On Linux or macOS, run:
source venv/bin/activate
Equivalently, on Windows, run:
venv\Scripts\activate
In the next steps, you will install the required Python packages. If you prefer to install everything right now, with your virtual environment activated, simply run:
pip install langchain langchain-openai pica-langchain python-dotenv
You are all set! You now have a Python development environment ready to build an AI agent with demlon integration in Pica.
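Before moving on, it can help to confirm that the interpreter meets the Python 3.9+ prerequisite and that the virtual environment is really active. The following is a minimal sketch using only the standard library; the function names are illustrative and not part of Pica or any other package used in this tutorial.

```python
import sys

MINIMUM_PYTHON = (3, 9)  # the version floor stated in the prerequisites


def python_is_supported(version_info=None, minimum=MINIMUM_PYTHON):
    """Return True when the interpreter version meets the minimum."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


def in_virtualenv():
    """Return True when running inside a venv-style virtual environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base interpreter.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("Python version OK:", python_is_supported())
print("Virtual environment active:", in_virtualenv())
```

If the second line prints False, activate the environment again before installing the packages.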
Step #2: Set Up Environment Variable Loading
Your agent will connect to third-party services like Pica, demlon, and OpenAI. To keep these integrations secure, avoid hardcoding your API keys directly into your Python code. Instead, store them as environment variables.
To make loading environment variables easier, use the python-dotenv library. In your activated virtual environment, install it with:
pip install python-dotenv
Next, import the library and call load_dotenv() at the top of your agent.py file to load your environment variables:
import os
from dotenv import load_dotenv
load_dotenv()
This function allows your script to read variables from a local .env file. Create this .env file in the root of your project directory. Your folder structure will look like this:
pica-demlon-agent/
├── venv/
├── .env # <-----------
└── agent.py
Great! You are now set up to securely handle your API keys and other secrets using environment variables.
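To build intuition for what happens when a .env file is loaded, here is a deliberately simplified sketch of the idea: read KEY=VALUE lines, skip comments, and copy the values into the environment without overwriting variables that are already set (which matches python-dotenv's default behavior). This is not python-dotenv's actual implementation, which supports much more syntax; the helper names parse_env_text and apply_to_environment are made up for illustration.

```python
def parse_env_text(text):
    """Parse simple KEY=VALUE lines into a dict, ignoring comments and blanks."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blank lines, comments, and malformed lines
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key] = value
    return values


def apply_to_environment(values, environ):
    """Copy parsed values into environ, keeping variables that already exist."""
    for key, value in values.items():
        environ.setdefault(key, value)


sample = 'PICA_API_KEY="example-value"\n# a comment\nOPENAI_API_KEY=another-value\n'
print(sorted(parse_env_text(sample)))
```

In the real project, keep using load_dotenv(); the sketch only shows why a missing or misspelled line in .env results in a missing environment variable later.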
Step #3: Configure Pica
If you have not done so yet, create a free Pica account. By default, Pica will generate an API key for you. You can use that API key with LangChain or any other supported integration.
Visit the “Quick start” page and select the “LangChain” tab:

Opening the “LangChain” tab
Here, you will find instructions to get started with Pica in LangChain. Specifically, follow the installation command shown there. In your activated virtual environment, run:
pip install langchain langchain-openai pica-langchain
Now, scroll down until you reach the “API Key” section:

Clicking the “Copy to clipboard” button in the “API Key” section
Click the “Copy to clipboard” button to copy your Pica API key. Then, paste it into your .env file by defining an environment variable like this:
PICA_API_KEY="<YOUR_PICA_KEY>"
Replace the <YOUR_PICA_KEY> placeholder with the actual API key you just copied.
Fantastic! Your Pica account is now fully configured and ready to use in your code.
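If you want to confirm that the key is visible to your script without printing the secret itself, one option is to print a masked version. The helper below is a small illustrative sketch (mask_secret is not part of any library used here) and relies only on the standard library.

```python
import os


def mask_secret(secret, visible=4):
    """Return the secret with everything after the first few characters hidden."""
    if not secret:
        return "<not set>"
    if len(secret) <= visible:
        return "*" * len(secret)
    return secret[:visible] + "*" * (len(secret) - visible)


# Prints "<not set>" until the variable is actually present in the environment
print("PICA_API_KEY:", mask_secret(os.environ.get("PICA_API_KEY")))
```

Run it after load_dotenv() has been called; otherwise the value from the .env file will not be in the environment yet.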
Step #4: Integrate demlon in Pica
Before getting started, make sure to follow the official guide to set up a demlon API key. You will need this key to connect your agent to demlon using the built-in integration available on the Pica platform.
Now that you have your API key, you can add the demlon integration in Pica.
In the “LangChain” tab of your Pica dashboard, scroll down to the “Recent Integrations” section and press the “Browse integrations” button:

Clicking the “Browse integrations” button
This will open a modal. In the search bar, type “brightdata” and select the “BrightData” integration:

Selecting the demlon integration
You will be prompted to enter the demlon API key you created earlier. Paste it in, then click the “Connect” button:

Pasting your demlon API key and pressing “Connect”
Next, on the left-hand menu, click on the “Connected Integrations” menu item:

Clicking the “Connected integrations” menu item
On the “Connected Integrations” page, you should now see demlon listed as a connected integration. From the table, click the “Copy to clipboard” button to copy your connection key:

Copying the Pica demlon connection key
Then, paste it into your .env file by adding:
PICA_BRIGHT_DATA_CONNECTION_KEY="<YOUR_PICA_BRIGHT_DATA_CONNECTION_KEY>"
Be sure to replace the <YOUR_PICA_BRIGHT_DATA_CONNECTION_KEY> placeholder with the actual connection key you copied.
You will need that value to initialize your Pica agent in code, so it knows to load the configured demlon connection. See how to do that in the next step!
Step #5: Initialize Your Pica Agent
In agent.py, initialize your Pica agent with:
pica_client = PicaClient(
    secret=os.environ["PICA_API_KEY"],
    options=PicaClientOptions(
        connectors=[
            os.environ["PICA_BRIGHT_DATA_CONNECTION_KEY"]
        ]
    )
)
pica_client.initialize()
The snippet above initializes a Pica client, connecting to your Pica account using the PICA_API_KEY secret loaded from your environment. It also selects the demlon integration you configured earlier from among all available connectors.
This means any AI agents you create with this client will be able to leverage demlon’s real-time web data retrieval capabilities.
Do not forget to import the required classes:
from pica_langchain import PicaClient
from pica_langchain.models import PicaClientOptions
Terrific! You are ready to proceed with LLM integration.
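Because the connectors option shown above takes a list, you may eventually want to collect several connection keys from the environment. Below is a minimal sketch of that idea under stated assumptions: collect_connection_keys is an illustrative helper, and PICA_EXAMPLE_CONNECTION_KEY is a hypothetical variable name for a second integration that this tutorial does not configure.

```python
import os


def collect_connection_keys(env, names):
    """Return the non-empty connection keys for the given variable names, in order."""
    keys = []
    for name in names:
        value = env.get(name, "").strip()
        if value:
            keys.append(value)
    return keys


CONNECTION_VARIABLES = [
    "PICA_BRIGHT_DATA_CONNECTION_KEY",  # defined earlier in this tutorial
    "PICA_EXAMPLE_CONNECTION_KEY",      # hypothetical second integration
]

connectors = collect_connection_keys(os.environ, CONNECTION_VARIABLES)
print(f"{len(connectors)} connector(s) configured")
```

The resulting list could then be passed as the connectors value in place of the single hardcoded entry.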
Step #6: Integrate OpenAI
Your Pica agent will need an LLM engine to understand the input prompts and perform the desired tasks using demlon’s capabilities.
This tutorial uses the OpenAI integration, so you’ll define the LLM for your agent in your agent.py file like this:
llm = ChatOpenAI(
    model="gpt-4o",
    temperature=0,
)
Note that all Pica LangChain examples in the documentation use temperature=0. This makes the model's output as repeatable as possible for a given input, although identical results are not strictly guaranteed.
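To see why a low temperature makes output more repeatable, consider the usual temperature-scaled softmax, p_i = exp(z_i / T) / sum_j exp(z_j / T), where z_i are the model's scores for candidate tokens. The sketch below is a generic illustration of that formula with made-up scores; it is not a description of how OpenAI implements sampling internally.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive; see greedy_choice for the zero limit")
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(score - peak) for score in scaled]
    total = sum(exps)
    return [value / total for value in exps]


def greedy_choice(logits):
    """The limiting behavior as temperature approaches 0: pick the top score."""
    return max(range(len(logits)), key=lambda index: logits[index])


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
for temperature in (1.0, 0.5, 0.1):
    probabilities = softmax_with_temperature(logits, temperature)
    print(temperature, [round(p, 3) for p in probabilities])
print("greedy index:", greedy_choice(logits))
```

As the temperature shrinks, almost all of the probability mass moves to the highest-scoring token, so repeated runs tend to pick the same continuation.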
Remember that the ChatOpenAI class comes from this import:
from langchain_openai import ChatOpenAI
In particular, ChatOpenAI expects your OpenAI API key to be defined in an environment variable named OPENAI_API_KEY. So, in your .env file, add:
OPENAI_API_KEY="<YOUR_OPENAI_API_KEY>"
Replace the <YOUR_OPENAI_API_KEY> placeholder with your actual OpenAI API key.
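At this point the .env file should define all three variables used in this tutorial. A quick sanity check can catch a missing key or a placeholder that was never replaced before the agent starts. The following is a minimal sketch; find_config_problems is an illustrative helper rather than part of any library.

```python
import os

REQUIRED_VARIABLES = (
    "PICA_API_KEY",
    "PICA_BRIGHT_DATA_CONNECTION_KEY",
    "OPENAI_API_KEY",
)


def find_config_problems(env, required=REQUIRED_VARIABLES):
    """Return human-readable problems for missing or placeholder values."""
    problems = []
    for name in required:
        value = env.get(name, "").strip()
        if not value:
            problems.append(f"{name} is missing or empty")
        elif value.startswith("<") and value.endswith(">"):
            # Looks like <YOUR_..._KEY> was never replaced
            problems.append(f"{name} still contains a placeholder")
    return problems


for problem in find_config_problems(os.environ):
    print("Configuration problem:", problem)
```

Calling this right after load_dotenv() gives a clearer message than the KeyError that os.environ["..."] would raise later.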
Amazing! You now have all the building blocks to define your Pica AI agent.
Step #7: Define Your Pica Agent
In Pica, an AI agent consists of three main parts:
A Pica client instance
An LLM engine
A Pica agent type
In this case, you want to build an AI agent that can call OpenAI functions (which in turn connect to demlon’s web retrieval capabilities via the Pica integration). Thus, create your Pica agent like this:
agent = create_pica_agent(
    client=pica_client,
    llm=llm,
    agent_type=AgentType.OPENAI_FUNCTIONS,
)
Do not forget to add the necessary imports:
from pica_langchain import create_pica_agent
from langchain.agents import AgentType
Marvelous! Now all that is left is to test your agent on a data retrieval task.
Step #8: Interrogate Your AI Agent
To verify that the demlon integration works in your Pica agent, give it a task it normally could not perform on its own. For example, ask it to retrieve updated data from a recent Amazon product page, such as the Nintendo Switch 2 (available at https://www.amazon.com/dp/B0F3GWXLTS/).
To do so, invoke your agent with this input:
agent_input = """
Use demlon to run a web scraping task and return the results from the following Amazon product URL:
https://www.amazon.com/dp/B0F3GWXLTS/
"""
result = agent.invoke({
    "input": agent_input
})
Note: The prompt is intentionally explicit. It tells the agent exactly what to do, which page to scrape, and which integration to use. This ensures that the LLM will leverage the demlon tools configured through Pica, producing the expected results.
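If you plan to scrape more than one page, the explicit prompt described in the note can be generated from a URL instead of being written by hand. The sketch below assumes a hypothetical helper named build_scrape_prompt; it only checks that the URL is an absolute HTTP(S) address and then reproduces the wording used in this step.

```python
from urllib.parse import urlparse


def build_scrape_prompt(url, provider="demlon", target="Amazon product"):
    """Build an explicit scraping prompt naming the integration and the page."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"Not an absolute HTTP(S) URL: {url!r}")
    return (
        f"Use {provider} to run a web scraping task and return the results "
        f"from the following {target} URL:\n{url}\n"
    )


print(build_scrape_prompt("https://www.amazon.com/dp/B0F3GWXLTS/"))
```

The returned string can be used as the "input" value when invoking the agent.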
Finally, print the agent output:
print(f"\nAgent Result:\n{result}")
And with this last line, your Pica AI agent is complete. Time to see it all come together in action!
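Printing the whole result works, but you may prefer to show only the final answer. LangChain agent executors commonly return a dictionary with an "output" key; whether the object returned by your Pica agent has exactly that shape is an assumption here, so the illustrative helper below falls back to printing whatever it receives.

```python
def extract_output(result):
    """Return the agent's final text, tolerating different result shapes."""
    if isinstance(result, dict):
        # Assumed shape: {"input": ..., "output": ...}; fall back to the full dict
        return str(result.get("output", result))
    return str(result)


# Stand-in value for demonstration; in agent.py, pass the real invoke() result
example_result = {"input": "example prompt", "output": "# Example report"}
print(extract_output(example_result))
```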
Step #9: Put It All Together
Your agent.py file should now contain:
import os
from dotenv import load_dotenv
from pica_langchain import PicaClient, create_pica_agent
from pica_langchain.models import PicaClientOptions
from langchain_openai import ChatOpenAI
from langchain.agents import AgentType

# Load environment variables from .env file
load_dotenv()

# Initialize Pica client with the specific demlon connector
pica_client = PicaClient(
    secret=os.environ["PICA_API_KEY"],
    options=PicaClientOptions(
        connectors=[
            os.environ["PICA_BRIGHT_DATA_CONNECTION_KEY"]  # Load the specific demlon connection
        ]
    )
)
pica_client.initialize()

# Initialize the LLM
llm = ChatOpenAI(
    model="gpt-4o",
    temperature=0,
)

# Create your Pica agent
agent = create_pica_agent(
    client=pica_client,
    llm=llm,
    agent_type=AgentType.OPENAI_FUNCTIONS,
)

# Execute a web data retrieval task in the agent
agent_input = """
Use demlon to run a web scraping task and return the results from the following Amazon product URL:
https://www.amazon.com/dp/B0F3GWXLTS/
"""
result = agent.invoke({
    "input": agent_input
})

# Print the produced output
print(f"\nAgent Result:\n{result}")
As you can see, in fewer than 50 lines of code, you built a Pica agent with powerful data retrieval capabilities. This is possible thanks to the demlon integration available directly on the Pica platform.
Run your agent with:
python agent.py
In your terminal, you should see logs similar to the following:
# Omitted for brevity...
2025-07-15 17:06:03,286 - pica_langchain - INFO - Successfully fetched 1 connections
# Omitted for brevity...
2025-07-15 17:06:05,546 - pica_langchain - INFO - Getting available actions for platform: demlon
2025-07-15 17:06:05,546 - pica_langchain - INFO - Fetching available actions for platform: demlon
2025-07-15 17:06:05,789 - pica_langchain - INFO - Found 54 available actions for demlon
2025-07-15 17:06:07,332 - pica_langchain - INFO - Getting knowledge for action ID: XXXXXXXXXXXXXXXXXXXX on platform: demlon
# Omitted for brevity...
2025-07-15 17:06:12,447 - pica_langchain - INFO - Executing action ID: XXXXXXXXXXXXXXXXXXXX on platform: demlon with method: GET
2025-07-15 17:06:12,447 - pica_langchain - INFO - Executing action for platform: demlon, method: GET
2025-07-15 17:06:12,975 - pica_langchain - INFO - Successfully executed Get Dataset List via demlon
2025-07-15 17:06:12,976 - pica_langchain - INFO - Successfully executed action: Get Dataset List on platform: demlon
2025-07-15 17:06:16,491 - pica_langchain - INFO - Executing action ID: XXXXXXXXXXXXXXXXXXXX on platform: demlon with method: POST
2025-07-15 17:06:16,492 - pica_langchain - INFO - Executing action for platform: demlon, method: POST
2025-07-15 17:06:22,265 - pica_langchain - INFO - Successfully executed Trigger Synchronous Web Scraping and Retrieve Results via demlon
2025-07-15 17:06:22,267 - pica_langchain - INFO - Successfully executed action: Trigger Synchronous Web Scraping and Retrieve Results on platform: demlon
In simpler terms, this is what your Pica agent did:
Connected to Pica and fetched your configured demlon integration.
Discovered there were 54 available tools on the demlon platform.
Retrieved a list of all datasets from demlon.
Based on your prompt, it selected the “Trigger Synchronous Web Scraping and Retrieve Results” tool and used it to scrape fresh data from the specified Amazon product page. Behind the scenes, this triggers a call to the demlon Amazon Scraper, passing in the Amazon product URL. The scraper will retrieve and return the product data.
Successfully executed the scraping action and returned the data.
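The same summary can be derived from the logs programmatically. The sketch below parses lines in the format shown above (timestamp, logger name, level, message) and lists the actions reported as successfully executed. The regular expressions are written against the sample log lines in this article only, so treat the exact log format as an assumption that may differ between pica_langchain versions.

```python
import re

LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})"
    r" - (?P<logger>\S+) - (?P<level>[A-Z]+) - (?P<message>.*)$"
)
EXECUTED = re.compile(
    r"^Successfully executed action: (?P<action>.+) on platform: (?P<platform>\S+)$"
)


def executed_actions(log_text):
    """Return (action, platform) pairs for every successfully executed action."""
    actions = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if not match:
            continue  # e.g. "# Omitted for brevity..." lines
        done = EXECUTED.match(match.group("message"))
        if done:
            actions.append((done.group("action"), done.group("platform")))
    return actions


SAMPLE_LOG = """\
# Omitted for brevity...
2025-07-15 17:06:12,975 - pica_langchain - INFO - Successfully executed Get Dataset List via demlon
2025-07-15 17:06:12,976 - pica_langchain - INFO - Successfully executed action: Get Dataset List on platform: demlon
2025-07-15 17:06:22,267 - pica_langchain - INFO - Successfully executed action: Trigger Synchronous Web Scraping and Retrieve Results on platform: demlon
"""

for action, platform in executed_actions(SAMPLE_LOG):
    print(f"{platform}: {action}")
```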
Your output should look something like this:

The output produced by the Pica agent
Paste this output into a Markdown editor, and you will see a well-formatted product report like this:

The formatted output data
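Instead of copying the text by hand, you could also write the report to a .md file and open that in your editor. Here is a minimal sketch with an illustrative save_report helper; the demonstration writes a stand-in report to a temporary directory so it does not touch your project folder.

```python
import tempfile
from pathlib import Path


def save_report(markdown_text, path="report.md"):
    """Write the Markdown text to a file and return its path."""
    target = Path(path)
    # Normalize trailing whitespace so the file ends with a single newline
    target.write_text(markdown_text.rstrip() + "\n", encoding="utf-8")
    return target


# Demonstration with placeholder content written to a temporary directory
with tempfile.TemporaryDirectory() as folder:
    saved = save_report("# Example report\n\nPlaceholder body.\n", Path(folder) / "report.md")
    print(saved.name, saved.stat().st_size, "bytes")
```

In agent.py, you would pass the agent's output text and a path inside your project instead.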
As you can tell, the agent was able to produce a Markdown report containing meaningful, up-to-date data from the Amazon product page. You can verify the accuracy by visiting the target product page in your browser:

The Nintendo Switch 2 target page on Amazon
Notice how the produced data is real data from the Amazon page, not hallucinated by the LLM. That is a testament to the scraping done through demlon’s tools. And this is just the beginning!
With the wide range of demlon actions available in Pica, your agent can now retrieve data from virtually any website. That includes complex targets like Amazon that are known for strict anti-scraping measures (such as the notorious Amazon CAPTCHA).
Et voilà! You just experienced seamless web scraping, powered by demlon integration within your Pica AI agent.
Conclusion
In this article, you saw how to use Pica to build an AI agent that can back its responses with fresh web data. This was made possible thanks to Pica’s built-in integration with demlon. The Pica demlon connector gives AI the ability to fetch data from any web page.
Keep in mind, this was just a simple example. If you want to build more advanced agents, you will need robust solutions for fetching, validating, and transforming live web data. That is precisely what you can find in the demlon AI infrastructure.
Create a free demlon account and start exploring our AI-ready web data extraction tools!