How to Build an AI Agent from Scratch | Step By Step Comprehensive Guide | (Without Any Coding Knowledge)

Creating AI agents that can automate tasks on your behalf has never been easier. You don't need coding skills or technical expertise to build powerful AI agents that can search the web, book flights, shop online, or even apply for jobs for you.

Here's a comprehensive guide based on the latest tools and techniques that will help you create custom AI agents that work in the browser, and you can even tune them to work offline without any internet connection.

So let's get into it.

Here is an example of the kind of AI agent we are going to create.

Using Browser Use

To create AI agents, we are going to use Browser Use, a free, open-source tool that lets you create AI agents capable of automating browser-based tasks.

Note: The installation and usage steps in this guide were carried out on macOS; on other operating systems the commands may differ. For OS-specific instructions, check the Browser Use and Web UI documentation.

Step 1: Install Python

Go to python.org/downloads
Download and install the latest version of Python for your operating system
Verify the installation by opening a terminal and typing python3 --version or python3
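
If you would rather check the version from inside Python, a minimal sketch like the one below can help. The 3.11 minimum is an assumption taken from the uv venv --python 3.11 command used later in this guide, and the function name is only illustrative.

```python
import sys

# Assumption: 3.11 is the minimum, taken from the `uv venv --python 3.11`
# command used later in this guide.
MIN_VERSION = (3, 11)


def python_is_recent_enough(version=None, minimum=MIN_VERSION):
    """Return True if the given (major, minor) version meets the minimum."""
    if version is None:
        version = sys.version_info[:2]
    return tuple(version[:2]) >= tuple(minimum)


print("Running Python", ".".join(map(str, sys.version_info[:3])))
print("Recent enough for this guide:", python_is_recent_enough())
```

Save it as a file of your choice and run it with python3; if it reports False, install a newer Python before continuing.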

Step 2: Install Browser Use

Open your terminal
Run the command: pip3 install browser-use
Next, install Playwright: playwright install

The installation might take some time. Don't close the terminal window while it is running.
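
Once both commands finish, you may want to confirm that they worked. The sketch below is one way to do that using only the Python standard library; it assumes the package is importable as browser_use and that the playwright command was placed on your PATH.

```python
import importlib.util
import shutil


def module_installed(module_name):
    """Return True if Python can locate the module without importing it."""
    return importlib.util.find_spec(module_name) is not None


def command_available(command):
    """Return True if the command is found on the PATH."""
    return shutil.which(command) is not None


# Assumption: the browser-use package is imported as `browser_use`.
print("browser_use importable:", module_installed("browser_use"))
print("playwright command found:", command_available("playwright"))
```

If either line prints False, re-run the matching install command from this step.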

To know more about this process, you can check out the Browser Use GitHub Page

Step 3: Set Up the Browser Use Web UI

We have installed the core Browser Use package, but to access and prompt it from the browser, we need the Web UI. To set it up, follow this procedure.

Create a new folder on your computer (e.g., "AI-agent-creation")
Then enter the following commands in the terminal:

cd Desktop (or wherever you saved the folder)
cd AI-agent-creation
git clone https://github.com/browser-use/web-ui.git

This clones the GitHub repository into that folder; you can enter it with the command cd web-ui in the terminal. Don't worry, the next steps explain this in detail.
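
The same folder-and-clone step can also be scripted. This is only a sketch: the helper names are made up for illustration, and it assumes git is installed and the folder lives on your Desktop.

```python
import subprocess
from pathlib import Path

REPO_URL = "https://github.com/browser-use/web-ui.git"  # from the step above


def clone_command(parent_folder):
    """Build the git command that clones web-ui inside parent_folder."""
    target = Path(parent_folder) / "web-ui"
    return ["git", "clone", REPO_URL, str(target)]


def clone_web_ui(parent_folder):
    """Create the parent folder if needed and clone the repository into it."""
    Path(parent_folder).mkdir(parents=True, exist_ok=True)
    subprocess.run(clone_command(parent_folder), check=True)
    return Path(parent_folder) / "web-ui"


# Example (needs git and a network connection), so it is left commented out:
# clone_web_ui(Path.home() / "Desktop" / "AI-agent-creation")
print(" ".join(clone_command("AI-agent-creation")))
```

Uncomment the example call to perform the clone; as written, the script only prints the command it would run.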

The setup is now almost done. What's left is to install some dependencies and set up the environment.

Step 4: Install Dependencies

Now we are going to install some dependencies. Make sure you are inside the web-ui sub-folder (run cd web-ui if you are not).

Then follow these steps

Install uv (a fast Python package manager) and create a virtual environment:

Create the environment with this command: uv venv --python 3.11

Note: If the uv command is not found or fails to run, install uv with this command: curl -LsSf https://astral.sh/uv/install.sh | sh and then run uv venv --python 3.11 again.

Now activate the environment: source .venv/bin/activate

The activation command may vary depending on your OS. For more details, check out the browser-use/web-ui repository.
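
To confirm that the activation worked, you can compare two values Python exposes about itself. The sketch below relies on standard behavior of Python virtual environments; the function name is just illustrative.

```python
import sys


def in_virtual_environment(prefix=None, base_prefix=None):
    """Return True when the interpreter runs inside a virtual environment.

    Inside a venv, sys.prefix points at the environment while
    sys.base_prefix still points at the original Python installation.
    """
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    return prefix != base_prefix


print("Virtual environment active:", in_virtual_environment())
print("Interpreter location:", sys.prefix)
```

After a successful activation, the interpreter location should end in .venv.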

Now install the requirements:

First, install the Python packages: uv pip install -r requirements.txt
Next, install the Playwright browsers: playwright install

Finally, start the Web UI with this command: python webui.py --ip 127.0.0.1 --port 7788

Wait a few seconds for the server to start. The terminal will then print a URL (with the options above, http://127.0.0.1:7788). Copy the URL and paste it into a browser.
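
If the page does not load, it helps to know whether the server is listening at all. The following sketch simply tries to open a TCP connection to the address and port passed to webui.py above; it says nothing about whether the page itself is healthy.

```python
import socket


def web_ui_is_up(host="127.0.0.1", port=7788, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


print("Web UI reachable at http://127.0.0.1:7788 :", web_ui_is_up())
```

Run it from a second terminal window while the webui.py command is still running in the first one.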

You have now completed the installation. Next, choose an LLM: you can either use a hosted model through an API, or use local models like Llama and DeepSeek so that the model itself runs on your computer without an internet connection.

Let's first look at how to run AI agents with API keys.

Step 5: Get an LLM API Key

Choose an LLM API provider, such as Gemini, ChatGPT, or Claude, or any other LLM API provider, and get an API key from it.

We suggest using the Gemini API as it provides a free tier. To get a Gemini API key, go to Google AI Studio and create one there.

Once you have the API key, go to the LLM configuration tab in the Browser Use Web UI and choose Google as the LLM provider, as we are using the Google API. Then select the model you are going to use and enter your API key below it.
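
Before pasting the key into the Web UI, you may want to check that it is accepted. The sketch below is not part of Browser Use: it assumes you exported the key in an environment variable named GEMINI_API_KEY (a hypothetical name chosen for this example) and, to our knowledge, uses the Gemini API's list-models endpoint as a cheap test request.

```python
import os
import urllib.error
import urllib.parse
import urllib.request

MODELS_ENDPOINT = "https://generativelanguage.googleapis.com/v1beta/models"


def mask_key(api_key):
    """Show only the last four characters so the key is safe to print."""
    if len(api_key) <= 4:
        return "*" * len(api_key)
    return "*" * (len(api_key) - 4) + api_key[-4:]


def models_url(api_key):
    """Build the list-models URL used as a cheap validity check."""
    return MODELS_ENDPOINT + "?" + urllib.parse.urlencode({"key": api_key})


def key_is_accepted(api_key, timeout=10):
    """Return True if the API answers the list-models request with HTTP 200."""
    try:
        with urllib.request.urlopen(models_url(api_key), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, TimeoutError):
        return False


# Hypothetical variable name; export it yourself before running.
key = os.environ.get("GEMINI_API_KEY", "")
print("Key loaded:", mask_key(key) if key else "no key found in GEMINI_API_KEY")
```

When you are online, calling key_is_accepted(key) should return True for a working key; the script does not make that request on its own.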

Optional: Using Local LLM models

If you don't want to use any API keys, or you want the model to run without an internet connection for privacy, you can use local LLM models like DeepSeek, Llama, Qwen, etc.

You can download and run these models with Ollama.

Install Ollama on your computer and then run the download command shown on the model's Ollama page in the terminal (for example, ollama pull followed by the model name) to fetch the LLM model you need.

For a detailed explanation check out this article

Once you have installed the model you like, enter in the terminal: ollama list

This will show all the downloaded models on your computer

Then you can directly choose the LLM provider and model name in the LLM configuration and run custom AI agents.
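
The list of downloaded models can also be read from a script, which is handy for copying the exact model name into the LLM configuration. This sketch assumes Ollama is running on its default local address and uses its /api/tags endpoint; the function names are illustrative.

```python
import json
import urllib.request

# Ollama's default local address; adjust it if you changed the port.
OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"


def model_names(payload):
    """Extract model names from the JSON returned by /api/tags."""
    return [model["name"] for model in payload.get("models", [])]


def installed_models(url=OLLAMA_TAGS_URL, timeout=3):
    """Return installed model names, or an empty list if Ollama is not running."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return model_names(json.load(resp))
    except (OSError, ValueError):
        # OSError covers connection failures; ValueError covers bad JSON.
        return []


print("Installed Ollama models:", installed_models())
```

An empty list means either that no models are downloaded yet or that Ollama is not running.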

However, Browser Use may not always support your local LLM model; in our case, we were only able to choose between Deepseek-reasoner and Deepseek-chat. The set of supported local LLM models changes from time to time.

Also note that, in our experience, only models with a high parameter count, such as DeepSeek 14B, are capable enough to run AI agents, and even a 14B model can take around 20 minutes to process a small agent task.

We recommend using an LLM API such as the Google API, with a free tier or paid options. If you have a high-powered computer, you can instead install larger local LLM models, which will perform better.
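
To get a feel for why larger models need a high-powered computer, here is a rough back-of-the-envelope estimate. It counts only the memory needed to hold the model weights and ignores the context cache and runtime overhead, so real usage will be higher; the bit widths shown are common examples, not settings taken from this guide.

```python
def weight_memory_gb(parameters_billion, bits_per_weight):
    """Approximate memory for the weights alone, in gigabytes (10^9 bytes)."""
    return parameters_billion * bits_per_weight / 8


for bits in (16, 8, 4):
    print(f"14B model at {bits}-bit: ~{weight_memory_gb(14, bits):.0f} GB")
```

By this estimate, a 14B model needs roughly 7 GB for its weights at 4 bits per weight and about 28 GB at 16 bits.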

To Activate Browser Use

If you want to use Browser Use again later, here are the commands you need to restart it.

Here are the steps

cd Desktop (the location where you stored your folder)
cd AI-agent-creation (folder name)
cd web-ui
uv venv --python 3.11
source .venv/bin/activate
uv pip install -r requirements.txt
playwright install
python webui.py --ip 127.0.0.1 --port 7788
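
If the environment from the first setup is still intact, some of these commands are likely redundant. The sketch below prints a shorter list when a .venv folder already exists; it assumes the packages and Playwright browsers installed earlier are still in place, and the function name is hypothetical.

```python
from pathlib import Path


def restart_commands(web_ui_dir):
    """Return the shell commands needed to start the Web UI again.

    If the .venv folder already exists, the environment and its packages
    are reused, so only activation and launch are listed.
    """
    web_ui_dir = Path(web_ui_dir)
    has_venv = (web_ui_dir / ".venv").is_dir()
    commands = [f"cd {web_ui_dir}"]
    if not has_venv:
        commands.append("uv venv --python 3.11")
    commands.append("source .venv/bin/activate")
    if not has_venv:
        commands.append("uv pip install -r requirements.txt")
        commands.append("playwright install")
    commands.append("python webui.py --ip 127.0.0.1 --port 7788")
    return commands


for command in restart_commands(Path.home() / "Desktop" / "AI-agent-creation" / "web-ui"):
    print(command)
```

If the Web UI fails to start with the shorter list, fall back to the full sequence of commands.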

Building AI agents

Now you just have to give a good prompt to create your custom AI agent.

As an example, we created an agent that logs into a Google account, writes an outline for an essay in Google Docs, and then sends an email about the essay to a Gmail user.

Prompt:

name:************

password:***********

Then open Google Docs and write an in-depth outline for the essay on the significance of AI Agents. Next, open Gmail and write an email saying "I have been working on the essay of significance of AI agents and I will submit it as soon as possible" to the email address ***************, saying "Hi, working on the essay" in the subject line. Make sure to send it and not keep it as a draft.

This is just a use case; you can write a prompt that will suit your needs.
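
If you reuse a prompt like this often, one option is to keep it as a template and fill in the details each time, so the credentials never sit in a saved file. The sketch below is a hypothetical illustration: the template wording, the opening login sentence, and the environment variable names are assumptions modeled on the example above, not part of Browser Use.

```python
import os
from string import Template

# Hypothetical template modeled on the example prompt above; the opening
# login sentence is an assumption.
PROMPT = Template(
    "Log in to Google with name: $name password: $password. "
    "Then open Google Docs and write an in-depth outline for the essay on $topic. "
    'Next, open Gmail and send an email to $recipient with the subject "$subject". '
    "Make sure to send it and not keep it as a draft."
)


def build_prompt(topic, recipient, subject, env=os.environ):
    """Fill the template, reading credentials from environment variables."""
    return PROMPT.substitute(
        name=env.get("AGENT_GOOGLE_NAME", "<name>"),
        password=env.get("AGENT_GOOGLE_PASSWORD", "<password>"),
        topic=topic,
        recipient=recipient,
        subject=subject,
    )


print(build_prompt("the significance of AI agents",
                   "friend@example.com",
                   "Hi, working on the essay"))
```

Paste the printed text into the Web UI's prompt field; changing the arguments gives you a new variant without retyping the whole prompt.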

The AI agent might not produce the exact output you want the first few times, but tuning the prompt a few times usually gets it much closer to what you need.

DeepResearch

Just like ChatGPT's deep research, you can also use the Deep Research feature in Browser Use. Type the topic you need research on, and in a few minutes, the AI agent will provide detailed information on it.

Conclusion

Getting started with Browser Use opens up a world of possibilities for automating your online tasks. By following the steps we've covered, you can create AI agents that handle everything from document creation to email management without constant supervision.

Remember that your first attempts might need some tweaking, but with a few adjustments to your prompts, you'll soon have agents that match your needs. The beauty of Browser Use lies in its flexibility: you can use API keys from providers like Gemini or run a model locally for privacy.

As you become more comfortable with the tool, you'll discover countless ways to save time and streamline your daily workflow. Browser Use puts the power of AI automation in your hands, making technology work for you instead of the other way around.

FAQs

1. What is Browser Use and why should I use it for AI agents?

Browser Use is a free, open-source tool that lets you create AI agents to automate web tasks. It's valuable because it works with popular websites and requires minimal coding knowledge to set up and customize.

2. Do I need programming experience to create AI agents with Browser Use?

No, you don't need extensive programming experience. While basic knowledge of Python helps, the step-by-step setup process is designed for beginners, and you can create agents primarily through natural language prompts.

3. Can Browser Use AI agents work with popular websites like Google and Gmail?

Yes, Browser Use AI agents can interact with popular websites including Google Docs, Gmail, social media platforms, and most other websites by automating browser actions just like a human would.

4. What LLM options are available for powering my Browser Use AI agents?

You can use API-based options like Gemini (recommended for its free tier), ChatGPT, or Claude. Alternatively, you can run local models like DeepSeek or Llama for complete privacy and offline use.

5. How long does it take to set up a working AI agent with Browser Use?

Setting up Browser Use typically takes 15-30 minutes for installation and configuration. Creating your first working AI agent might take another 15 minutes, depending on task complexity and prompt refinement.
