Software Engineer, Web Crawling
Job Summary
Exa is an applied AI lab building a search engine unlike anything the world has ever seen. We build massive-scale infrastructure to crawl the entire web, train state-of-the-art embedding models to process it, and design high-performance vector databases to retrieve over it. We now power search for Cursor, Cognition, HubSpot, and over 400,000 developers, and have raised $350M from Lightspeed, Benchmark, and a16z.
Our ultimate goal is to build perfect search over all the world's information, far beyond Google. If you want to build massive-scale ML systems that will define the way the new AI world consumes information, this is the place for you.
As a web crawling engineer, you'd be responsible for crawling the entire web. In short, you'd build Google-scale crawling!
Who You Are
You have extensive experience building and scaling web crawlers or would be excited to ramp up very quickly
You have experience with a high-performance language (C, Rust, etc.)
You are familiar with TypeScript, Playwright, modern web design, and CDP (Chrome DevTools Protocol)
You're comfortable optimizing a system to an exceptional degree
You care about the problem of finding high-quality knowledge and recognize how important this is for the world
What You Could Do
Build a distributed crawler that can handle 100M pages per day
Optimize crawl politeness and rate limiting across thousands of domains
Design systems to detect and handle dynamic content, JavaScript rendering, and anti-bot measures
Create intelligent crawl scheduling and prioritization algorithms for maximum coverage efficiency
Logistics
Location: This is an in-person opportunity in Singapore.
Visas: We're happy to sponsor international candidates! While we cannot guarantee your visa, we have historically been successful in sponsoring candidates from all over the world. If you receive an offer, our team will work hard to get you a visa.
Benefits: We offer premium healthcare benefits (medical, dental, vision), fertility benefits, 16 weeks of fully paid parental leave for all new parents, and a monthly wellness stipend to all of our employees.
Required Experience:
IC
About Company
Real-time AI search engine with a powerful web search API, web crawling API, SERP API, and deep research tools. Search and extract structured content from websites and live data.