search
communitySearch the web via the Bright Data CLI — `bdata search` for Google/Bing/Yandex SERP, `bdata discover` for intent-ranked semantic results. Use when the user wants SERP results, needs URLs to feed into scraping, or wants semantic web discovery with optional page content. Hands off to `scrape` once target URLs are chosen, and to `data-feeds` when the user wants structured data from a known platform. Requires the Bright Data CLI; proactively guides install + login if missing.
name: search
description: Search the web via the Bright Data CLI — bdata search for Google/Bing/Yandex SERP, bdata discover for intent-ranked semantic results. Use when the user wants SERP results, needs URLs to feed into scraping, or wants semantic web discovery with optional page content. Hands off to scrape once target URLs are chosen, and to data-feeds when the user wants structured data from a known platform. Requires the Bright Data CLI; proactively guides install + login if missing.
Bright Data — Search
Find things on the web. Two commands live in this skill:
bdata search— classic keyword SERP (Google/Bing/Yandex). Best when you want "what ranks for keyword X."bdata discover— AI intent-ranked discovery with optional page content. Best when you want "pages about topic Y that match intent Z."
For structured data from a known platform (Amazon, LinkedIn, TikTok, …), stop and use data-feeds instead.
Setup gate (run first)
if ! command -v bdata >/dev/null 2>&1; then
echo "bdata CLI not installed — see bright-data-best-practices/references/cli-setup.md"
elif ! bdata zones >/dev/null 2>&1; then
echo "bdata not authenticated — run: bdata login (or: bdata login --device for SSH)"
fi
Halt and route to skills/bright-data-best-practices/references/cli-setup.md if either check fails.
Pick your path
| Situation | Action |
|---|---|
| Single keyword query, just SERP | bdata search "<query>" --engine google --json --pretty |
| Paginated SERP (more results) | loop --page 0, --page 1, … (0-indexed) |
| Multiple queries | shell loop over a queries file |
| Intent-ranked / semantic (not keyword) | bdata discover "<query>" --intent "<intent>" --num-results 20 |
| Want page bodies along with results, one pass | bdata discover ... --include-content |
| News / images / shopping SERP | bdata search "<query>" --type news (or images, shopping) |
| Want Amazon/LinkedIn/TikTok/… structured data | stop — hand off to data-feeds |
| Have URLs, want content | hand off to scrape |
Action
Core commands:
# Google SERP, structured JSON
bdata search "site:example.com privacy policy" --engine google --json --pretty
# Localized Bing (German results, German language)
bdata search "datenschutz" --engine bing --country de --language de --json
# Second page of results (0-indexed)
bdata search "machine learning papers" --page 1 --json
# Mobile SERP (rankings differ from desktop)
bdata search "best coffee shops" --device mobile --json
# News vertical
bdata search "openai" --type news --json --pretty
# Intent-ranked discovery
bdata discover "enterprise LLM platforms" \
--intent "vendor pages with pricing" \
--num-results 15 --json
# Discovery with page content in markdown
bdata discover "webhook best practices" \
--include-content --num-results 10 -o results.json
# Date-filtered discovery
bdata discover "react server components" \
--start-date 2025-01-01 --end-date 2025-12-31 --num-results 20
Full flag reference: references/flags.md.
search vs discover — pick the right one
| You want | Use |
|---|---|
| "What Google ranks for this exact keyword" | search |
| "Pages that match this meaning/intent" | discover |
| "News / images / shopping vertical SERP" | search --type <vertical> |
| "Results + page bodies in one call" | discover --include-content |
| "Dedup / semantic ranking across queries" | discover |
Verification gate
- JSON parses cleanly:
jq . <output>returns 0. - Result array non-empty — if empty, the query is legitimately zero-result; relax the query and re-run. Don't claim success on empty results without telling the user.
- Required fields present:
search: results live at.organic[]; each hastitle+linkdiscover: results live at.results[]; each hastitle+link; if--include-content, alsocontent
- For
discover --include-content: no block-page signatures in thecontentfield (same list as scrape, case-insensitive):Access DeniedJust a momentAttention RequiredChecking your browsercaptchacf-browser-verificationcloudflare(with < 2KB total body)
- Geo sanity: if the user expected country-specific results, inspect TLDs / languages of top results. If mis-localized, re-run with explicit
--countryand--language.
Red flags
- Using
searchto fetch content from Amazon, LinkedIn, TikTok, etc. whendata-feedsreturns clean structured data in one call. - Scraping every SERP result blindly — filter first (domain allowlist, keyword in title, relevance heuristic).
- Confusing
search(keyword) withdiscover(semantic). They answer different questions. - Running multiple queries without deduping URLs across result sets before scraping.
- Assuming SERP order is universal — it's personalized by geo + device. Always set
--countryand--deviceexplicitly for reproducibility. - Using
--pageas a result count — it's a page index, not a limit. Each page returns ~10 results. - Assuming SERP results are at
.results[]— forbdata searchthey live at.organic[]. (Discover uses.results[].) - Hardcoding
--num-results 100ondiscoverwithout realizing the pipeline polls until that many are found; can be slow.
References
references/flags.md— full flags forsearchanddiscoverwith when-to-use notes.references/patterns.md— multi-query dedup, SERP → filter → scrape pipeline,searchvsdiscoverdecision, legacycurlfallback, shared verification checklist.references/examples.md— (1) single Google query, (2) localized Bing, (3) batch queries + dedup into URL list, (4)discover --include-contentend-to-end.