webtask
Easy-to-use LLM-powered browser automation — from autonomous tasks to element-level control.
Why webtask?
- High-level tasks: Describe what you want done — the agent figures out the steps
- Low-level control: Select any element with natural language — no CSS/XPath selectors needed
Quick Start
# Uses top-level await: run inside an async function or an asyncio-aware REPL/notebook
from webtask import Webtask
from webtask.integrations.llm import Gemini
wt = Webtask()
agent = await wt.create_agent(
    llm=Gemini(model="gemini-2.5-flash"),
    wait_after_action=1.0,
)
await agent.goto("https://practicesoftwaretesting.com")
await agent.wait(3)
# select: pick elements with natural language
search = await agent.select("the search input")
await search.fill("pliers")
# do: simple or complex tasks — agent figures out the steps
await agent.do("click search and add the first product to cart")
# extract: get structured data from the page
price = await agent.extract("the cart total price")
# verify: check conditions
assert await agent.verify("cart has 1 item")
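The Quick Start calls use `await` at the top level, which works in a notebook or an asyncio REPL but not in a plain script. A minimal sketch of one way to run the same steps as a script, assuming only the calls shown above; the names `quick_start`, `main`, and the `query` parameter are hypothetical helpers, not part of webtask:

```python
import asyncio


async def quick_start(agent, query: str = "pliers"):
    """Run the Quick Start steps against an already-created agent."""
    await agent.goto("https://practicesoftwaretesting.com")
    await agent.wait(3)  # same fixed wait as in the Quick Start

    # select: pick the search box with natural language, then type into it
    search = await agent.select("the search input")
    await search.fill(query)

    # do / extract: let the agent act, then read a value off the page
    await agent.do("click search and add the first product to cart")
    price = await agent.extract("the cart total price")

    # verify: fail loudly if the cart is not in the expected state
    if not await agent.verify("cart has 1 item"):
        raise RuntimeError("expected exactly one item in the cart")
    return price


async def main():
    # Imported here so the helper above can be reused without webtask installed.
    from webtask import Webtask
    from webtask.integrations.llm import Gemini

    wt = Webtask()
    agent = await wt.create_agent(
        llm=Gemini(model="gemini-2.5-flash"),
        wait_after_action=1.0,
    )
    print(await quick_start(agent))


# To run as a script: asyncio.run(main())
```

Keeping the steps in a function that receives the agent also makes them easy to exercise against a stand-in object in tests, without launching a browser or calling an LLM.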
Features
Four core operations
await agent.do("click search and add first product to cart") # Autonomous tasks
element = await agent.select("the search input") # Element selection
data = await agent.extract("the cart total", MySchema) # Data extraction
assert await agent.verify("cart has 1 item") # Verification
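`MySchema` in the extraction line above is not defined in this README. As a rough illustration of what such a schema could describe, here is a small typed record for the cart total. It is written as a stdlib dataclass only so that it runs without dependencies; the class name `CartTotal` and its fields are hypothetical, and the library may expect a different schema type (for example a Pydantic model), so check its documentation before relying on this shape:

```python
from dataclasses import dataclass


@dataclass
class CartTotal:
    """Hypothetical shape for the extracted cart total; not part of webtask."""

    amount: float  # numeric total as shown in the cart
    currency: str  # currency code or symbol shown on the page


# Hypothetical usage, if the library accepts this kind of schema:
#   data = await agent.extract("the cart total", CartTotal)
example = CartTotal(amount=14.15, currency="USD")
```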
Stateful agents — Agent remembers context across tasks
await agent.do("Add pliers to cart")
await agent.do("Add a hammer too") # Remembers previous action
agent.clear_history() # Reset when needed
Two modes — DOM-based or pixel-based interaction
agent = await wt.create_agent(llm=llm, mode="dom") # Element IDs (default)
agent = await wt.create_agent(llm=llm, mode="pixel") # Screen coordinates
Browser integration — Works with new or existing browsers
agent = await wt.create_agent(llm=llm) # New browser
agent = await wt.create_agent_with_browser(llm=llm, browser=browser) # Existing browser
agent = wt.create_agent_with_context(llm=llm, context=context) # Existing context
agent = wt.create_agent_with_page(llm=llm, page=page) # Existing page
Error handling — Handle task failures gracefully