jemacchi/browser-use

🌐 Browser Use

Make websites accessible for AI agents 🤖.

License: MIT | Python 3.11+

Browser Use is the easiest way to connect your AI agents with the browser. If you have used Browser Use for your project, feel free to show it off in our Discord.

Quick start

With pip:

pip install browser-use

(optional) install playwright:

playwright install

Spin up your agent:

from langchain_openai import ChatOpenAI
from browser_use import Agent
import asyncio

async def main():
    agent = Agent(
        task="Find a one-way flight from Bali to Oman on 12 January 2025 on Google Flights. Return me the cheapest option.",
        llm=ChatOpenAI(model="gpt-4o"),
    )
    result = await agent.run()
    print(result)

if __name__ == "__main__":
    asyncio.run(main())

And don't forget to add your API keys to your .env file.

OPENAI_API_KEY=
ANTHROPIC_API_KEY=
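Before starting an agent, it can help to confirm that the keys are actually visible to the process. The following is a minimal sketch using only the standard library. The helper name `missing_keys` is hypothetical and not part of browser-use, and the sketch assumes the variables from your .env file have already been loaded into the environment (for example by your shell or a dotenv loader).

```python
import os
from typing import Iterable, List, Mapping, Optional


def missing_keys(required: Iterable[str],
                 env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the names in `required` that are unset or blank in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    # Check only the key for the provider you actually use.
    missing = missing_keys(["OPENAI_API_KEY"])
    if missing:
        print(f"Missing API keys: {', '.join(missing)}")
```

Passing a plain dictionary as `env` makes the check easy to try without touching the real environment.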

Demos

Prompt: Read my CV & find ML jobs, save them to a file, and then start applying for them in new tabs, if you need help, ask me. (8x speed)

[Video: apply.to.jobs.8x.mp4]

Prompt: Find flights on kayak.com from Zurich to Beijing from 25.12.2024 to 02.02.2025. (8x speed)

[Animation: flight search 8x 10fps]

Prompt: Solve the captcha. (2x speed)

[Animation: Solving Captcha]

Prompt: Look up models with a license of cc-by-sa-4.0 and sort by most likes on Hugging face, save top 5 to file. (1x speed)

[Video: hugging_face_high_quality.mp4]

Features ⭐

  • Vision + HTML extraction
  • Automatic multi-tab management
  • Extract the XPaths of clicked elements and repeat exact LLM actions
  • Add custom actions (e.g. save to file, push to database, notify me, get human input)
  • Self-correcting
  • Use any LLM supported by LangChain (e.g. GPT-4o, GPT-4o mini, Claude 3.5 Sonnet, Llama 3.1 405B)

Register custom actions

You can use both sync and async functions. To add custom actions your agent can take, register them like this:

from browser_use.agent.service import Agent
from browser_use.browser.service import Browser
from browser_use.controller.service import Controller

# Initialize controller first
controller = Controller()

@controller.action('Ask user for information')
def ask_human(question: str, display_question: bool) -> str:
    return input(f'\n{question}\nInput: ')

Or define your parameters using Pydantic:

from typing import Optional
from pydantic import BaseModel

class JobDetails(BaseModel):
    title: str
    company: str
    job_link: str
    salary: Optional[str] = None

@controller.action('Save job details which you found on page', param_model=JobDetails, requires_browser=True)
async def save_job(params: JobDetails, browser: Browser):
    print(params)

    # use the browser normally
    page = browser.get_current_page()
    page.go_to(params.job_link)

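Then run your agent:

from langchain_anthropic import ChatAnthropic

model = ChatAnthropic(model_name='claude-3-5-sonnet-20240620', timeout=25, stop=None, temperature=0.3)
agent = Agent(task=task, llm=model, controller=controller)

# Call this from inside an async function, e.g. one started with asyncio.run()
await agent.run()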
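To illustrate how a decorator-based registry could accept both sync and async functions, here is a small self-contained sketch. `MiniController`, its `call` method, and the two example actions are hypothetical names invented for this illustration. This is not how browser-use's `Controller` is implemented, only one common way to build the pattern.

```python
import asyncio
import inspect
from typing import Any, Callable, Dict


class MiniController:
    """Hypothetical stand-in for a controller; not the browser_use class."""

    def __init__(self) -> None:
        self.actions: Dict[str, Callable[..., Any]] = {}

    def action(self, description: str) -> Callable:
        """Register the decorated function under its own name."""
        def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
            func.description = description  # text an LLM could be shown
            self.actions[func.__name__] = func
            return func
        return decorator

    async def call(self, name: str, **kwargs: Any) -> Any:
        """Run an action, awaiting it only if it is a coroutine function."""
        func = self.actions[name]
        if inspect.iscoroutinefunction(func):
            return await func(**kwargs)
        return func(**kwargs)


controller = MiniController()


@controller.action("Add two numbers")
def add(a: int, b: int) -> int:
    return a + b


@controller.action("Echo a message after yielding to the event loop")
async def echo(message: str) -> str:
    await asyncio.sleep(0)
    return message


async def demo() -> list:
    # Both kinds of action go through the same call path.
    return [
        await controller.call("add", a=1, b=2),
        await controller.call("echo", message="hi"),
    ]
```

The key design choice is checking `inspect.iscoroutinefunction` at call time, so action authors can write whichever style fits the work being done.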

Get XPath history

To get the entire history of everything the agent has done, you can use the output of the run method:

history: list[AgentHistory] = await agent.run()
print(history)

More examples

For more examples see the examples folder or join the Discord and show off your project.

Telemetry

We collect anonymous usage data to understand how the library is used and to identify potential issues. No personal information is collected. The data is collected with PostHog.

You can opt out of telemetry by setting the environment variable ANONYMIZED_TELEMETRY=false.
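If you prefer to opt out from within a script instead of the shell or .env file, one option is to set the variable in Python. This is a minimal sketch that assumes the library reads the variable when it is imported or when the agent runs, so the assignment is placed before the import to be safe.

```python
import os

# Set the flag before importing the library, on the assumption that
# the variable may be read at import time.
os.environ["ANONYMIZED_TELEMETRY"] = "false"

# from browser_use import Agent  # import only after the flag is set
```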

Contributing

Contributions are welcome! Feel free to open issues for bugs or feature requests.

Local Setup

  1. Create a virtual environment and install dependencies:
     # To install all dependencies including dev
     pip install . ."[dev]"
  2. Add your API keys to the .env file:
     cp .env.example .env

or copy the following to your .env file:

OPENAI_API_KEY=
ANTHROPIC_API_KEY=

You can use any LLM supported by LangChain by adding the appropriate environment variables. See the LangChain models documentation for available options.

Building the package

hatch build

Feel free to join the Discord for discussions and support.


Star ⭐ this repo if you find it useful!
Made with ❤️ by the Browser-Use team
