If you’ve been paying attention to how AI search is evolving, you’ve probably heard the term “fanout queries” floating around the GEO and SEO community. But knowing the concept is one thing — actually identifying the fanout queries that matter for your brand is another.
In this post, I’ll walk you through a practical method I’ve built to extract fanout queries at scale using the Gemini API, a simple Python script, and a free Jupyter notebook. By the end, you’ll have a repeatable workflow to uncover the hidden sub-queries that LLMs generate behind the scenes — and a clear path to turning those insights into content that improves your AI search visibility.
What Are Fanout Queries and Why Do They Matter for SEO?
When you type a prompt into an LLM like Gemini, ChatGPT, or Perplexity, the model doesn’t just match your exact words to a web page. Instead, it breaks your question down into a series of related sub-queries — each targeting a specific angle, entity, or intent — and then retrieves information from multiple sources to assemble a comprehensive answer.
These sub-queries are called fanout queries (also referred to as “query fan-out”).
For example, if you ask an LLM “What’s the best electric car model in 2026?”, the model might silently generate sub-queries such as:
- electric car models 2026
- best electric cars comparison
- electric car range and pricing 2026
- electric vehicle showroom 2026
- top-rated EVs consumer reports
The LLM then searches the web (or its internal knowledge) for each of these, pulls the most relevant content, and synthesizes it all into a single response.
Why this matters for SEO and GEO: your content isn’t just competing for the original prompt. It’s competing across every sub-query the AI generates. If you only have content targeting the main keyword but nothing addressing the related fanout queries, your brand may never appear in the AI-generated answer — even if you rank well in traditional search for the primary term.
This is the core principle behind Generative Engine Optimization (GEO): optimizing not just for one keyword, but for the entire cluster of queries and intents that an LLM explores when answering a prompt.
The Problem: You Can’t Optimize for Queries You Don’t Know About
Traditional SEO keyword research gives you search volume, difficulty scores, and related terms based on what humans type into Google. But fanout queries are different. They’re generated internally by the LLM’s reasoning process, and they often include phrasing, combinations, or angles you wouldn’t find in a standard keyword tool.
That creates a visibility gap: you might have great content for the surface-level query, but you’re missing content for the sub-queries where the LLM actually pulls its sources.
To close that gap, you need to reverse-engineer the fanout process — and that’s exactly what the script I’m about to share does.
How to Extract Fanout Queries Using the Gemini API
Here’s the approach I’ve built and tested. It uses a Python script that sends your prompts to the Gemini API and asks the model to identify the fanout queries it would generate when answering each prompt.
What You’ll Need
- A Gemini API key (available through Google AI Studio)
- A Jupyter notebook — I use Google Colab (essentially Jupyter in the cloud) because it’s free, runs in the browser, and requires no local setup. You can also run the notebook locally in any Python environment. You can find the script I used here. If you run into issues, paste the code into Claude and ask it to modify the script — Claude wrote it in the first place
- Don’t forget to insert your API key and add your prompts in the designated sections of the notebook
Step-by-Step Walkthrough
Step 1: Prepare your prompts. Start by compiling a list of prompts that represent the types of questions your target audience might ask an LLM. These should be the kind of conversational, intent-rich queries that people type into ChatGPT, Gemini, or Perplexity — not just short-tail keywords.
For example, if you’re in the electric vehicle space, your prompt list might include:
- “What’s the best electric car model in 2026?”
- “Are electric cars worth buying in Europe right now?”
- “Which EV has the longest range for under €40,000?”
Step 2: Run the script. The Python script takes each prompt, sends it to the Gemini API, and asks the model to return the set of sub-queries (fanout queries) it would use to research and answer that prompt. You simply paste your API key, define your prompts in the script, and hit play.
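The exact script is linked above, but as a rough sketch of what it does, here’s a minimal standalone version using only Python’s standard library. The REST endpoint, model name, and instruction wording are my assumptions for illustration — not necessarily what the linked notebook uses, so check Google’s current API docs before relying on them:

```python
import json
import os
import urllib.request

# Assumed Gemini REST endpoint and model name -- verify against Google's docs.
API_URL = ("https://generativelanguage.googleapis.com/v1beta/"
           "models/gemini-1.5-flash:generateContent?key={key}")

# Assumed instruction wording for eliciting fanout queries.
INSTRUCTION = (
    "List the web search sub-queries (fan-out queries) you would generate "
    "to research and answer the following prompt. "
    "Return one query per line, with no numbering or commentary.\n\nPrompt: "
)

def ask_gemini(prompt: str, api_key: str) -> str:
    """Send one prompt to the Gemini API and return the raw text reply."""
    body = json.dumps(
        {"contents": [{"parts": [{"text": INSTRUCTION + prompt}]}]}
    ).encode("utf-8")
    req = urllib.request.Request(
        API_URL.format(key=api_key), data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return data["candidates"][0]["content"]["parts"][0]["text"]

def parse_queries(raw: str) -> list[str]:
    """Turn the model's line-separated reply into a clean list of queries."""
    return [line.strip("-* ").strip() for line in raw.splitlines() if line.strip()]

# Only call the live API if a key is set in the environment.
if __name__ == "__main__" and os.environ.get("GEMINI_API_KEY"):
    reply = ask_gemini("What's the best electric car model in 2026?",
                       os.environ["GEMINI_API_KEY"])
    for query in parse_queries(reply):
        print(query)
```

With a `GEMINI_API_KEY` environment variable set, running this prints one fanout query per line for the sample prompt.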
Step 3: Review the output. For each prompt, the script returns a list of fanout queries. In the notebook, you’ll see the results printed directly. But for practical use — especially when you’re working with 50 or 100+ prompts — reading through the raw output isn’t efficient.
Step 4: Export to CSV. The script automatically generates a CSV file that categorizes all fanout queries by their original prompt. You’ll find this in the file browser of your Jupyter environment. Download it and you have a structured dataset ready for analysis.
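If you’re adapting the script, the export step is straightforward with Python’s built-in `csv` module. A sketch — the function name and data shape are mine, not necessarily the notebook’s:

```python
import csv

def export_fanout_csv(results: dict[str, list[str]], path: str) -> None:
    """Write one (prompt, fanout query) row per sub-query to a CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["Prompt", "Fanout Query"])
        for prompt, queries in results.items():
            for query in queries:
                writer.writerow([prompt, query])

# Example usage with made-up data:
export_fanout_csv(
    {"What's the best electric car model in 2026?":
        ["electric car models 2026", "best electric cars ranking"]},
    "fanout_queries.csv",
)
```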
Example Output
For the prompt “What’s the best electric car model in 2026?”, the Gemini API might return fanout queries like:
| Prompt | Fanout Query |
|---|---|
| What’s the best electric car model in 2026? | electric car models 2026 |
| What’s the best electric car model in 2026? | best electric cars ranking |
| What’s the best electric car model in 2026? | electric car showroom 2026 |
| What’s the best electric car model in 2026? | EV range comparison 2026 |
| What’s the best electric car model in 2026? | electric car reviews consumer |
How to Turn Fanout Query Data Into Actionable SEO and GEO Insights
Having a CSV of fanout queries is useful, but the real value comes from analyzing the data to find patterns, gaps, and opportunities. Here’s how to do that efficiently.
Identify Content Gaps
Once you’ve exported your fanout queries, upload the CSV into an AI assistant like Claude or ChatGPT and ask it to:
- Identify patterns and recurring themes across your fanout queries
- Cross-reference with your existing content to find topics you haven’t covered
- Flag high-opportunity keywords — fanout queries that appear frequently but where you have no content
You can also cross-reference the fanout queries against your Semrush or Ahrefs data to check which of these terms you’re already ranking for in traditional search and which ones represent net-new opportunities.
Prioritize by Frequency and Relevance
Not every fanout query is equally important. Look for queries that:
- Appear across multiple prompts — these are likely core sub-topics that LLMs consistently explore
- Align with your product or service — prioritize queries where your brand can provide authoritative, first-hand answers
- Have existing search volume — fanout queries that also have traditional search demand give you a double benefit: visibility in both AI answers and organic search results
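The first of these signals — how many prompts a fanout query recurs across — is easy to compute directly from the script’s output before you even open the CSV. A small sketch with hypothetical data (not real API results):

```python
from collections import Counter

# Hypothetical fanout output for three prompts.
fanout = {
    "What's the best electric car model in 2026?":
        ["ev range comparison 2026", "electric car models 2026"],
    "Which EV has the longest range for under 40000 euros?":
        ["ev range comparison 2026", "ev prices under 40000"],
    "Are electric cars worth buying in Europe right now?":
        ["ev range comparison 2026", "electric car incentives europe"],
}

# Count how many distinct prompts each fanout query appears under.
frequency = Counter(q for queries in fanout.values() for q in set(queries))

for query, count in frequency.most_common():
    print(f"{count}x  {query}")
```

Here “ev range comparison 2026” surfaces under all three prompts, flagging it as a core sub-topic worth prioritizing.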
Build Content Around Query Clusters
Instead of creating one page per fanout query, think in terms of content clusters. Group related fanout queries together and create comprehensive content that addresses the full cluster. This approach aligns with how LLMs actually work: they’re looking for content that covers a topic in depth, not isolated pages targeting narrow variations.
For example, if your fanout analysis reveals queries like “EV battery lifespan 2026,” “electric car maintenance costs,” and “EV total cost of ownership,” those could all be addressed in a single, thorough guide on the economics of owning an electric vehicle.
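You can get a rough first pass at these clusters programmatically before handing the data to an AI assistant. This naive sketch groups queries that share a non-generic token — the query list and stopword set are illustrative, and real clustering would more likely use embeddings:

```python
# Tokens too generic to signal a shared topic in this niche (illustrative).
STOPWORDS = {"2026", "ev", "electric", "car", "cars"}

queries = [
    "ev battery lifespan 2026",
    "electric car battery replacement cost",
    "ev maintenance costs 2026",
    "electric car maintenance guide",
]

def tokens(query: str) -> set[str]:
    """Topic-bearing tokens of a query, with generic words removed."""
    return {t for t in query.lower().split() if t not in STOPWORDS}

# Greedily assign each query to the first cluster it shares a token with.
clusters: list[list[str]] = []
for query in queries:
    for cluster in clusters:
        if any(tokens(query) & tokens(other) for other in cluster):
            cluster.append(query)
            break
    else:
        clusters.append([query])

for i, cluster in enumerate(clusters, 1):
    print(f"Cluster {i}: {cluster}")
```

With this data the battery queries land in one cluster and the maintenance queries in another — each cluster a candidate for a single comprehensive guide.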
Why This Approach Works for GEO
Generative Engine Optimization is fundamentally about ensuring your brand appears in AI-generated answers. And AI-generated answers are built from fanout queries. By reverse-engineering the fanout process, you’re effectively seeing the same research the LLM does — and you can proactively create (or optimize) content for each of those research threads.
This approach gives you several advantages:
- You stop guessing which topics matter for AI visibility and start working with actual data from the LLM’s reasoning process.
- You discover angles you’d miss with traditional keyword research — fanout queries often surface adjacent topics, comparisons, and entity-level questions that standard tools don’t highlight.
- You build topical authority by systematically covering the full intent space around your core topics, making it more likely that your content gets cited across multiple fanout paths.
Scaling the Process
The example I’ve shown uses just three prompts, but the real power of this approach comes from scale. When you run 50, 100, or even 200 prompts through the script, you start seeing macro-level patterns that reveal your brand’s true content gaps in the context of AI search.
At that scale, manually reviewing the data isn’t practical — which is exactly why the CSV export and AI-assisted analysis step is essential. Upload the full dataset, ask for a summary of themes and gaps, and you’ll have a prioritized content roadmap in minutes.