Building a Multi-Agent Research Team with Google ADK, Tavily Search, and Crawl4AI

Learn how to create a team of specialized AI agents using Google's Agent Development Kit (ADK) to search the web, analyze URLs, and summarize content in a collaborative workflow.

After getting started with Google’s Agent Development Kit (ADK), the next step in harnessing its power is building multi-agent systems, where specialized AI agents collaborate to accomplish more complex tasks. ADK’s architecture is well suited to creating agent teams in which each member has a specific role and all members cooperate through a coordinator agent that orchestrates the workflow.

In this tutorial, we’ll build a sophisticated research team of AI agents using Google ADK. Our agent team will be capable of searching the internet with Tavily Search via LangChain, extracting and analyzing content from URLs with Crawl4AI, and summarizing findings—all working together through a coordinator agent that manages the collaborative process. We’ll also set up our system to work with the built-in adk web tool for a seamless interactive experience.

What We’ll Build: The Research Assistant Team

We’ll create a multi-agent system with four specialized agents working together:

| Agent | Role | Tools | Special Abilities |
| --- | --- | --- | --- |
| Coordinator | Team leader that manages the workflow and delegates tasks to the right specialist | Agent-as-a-Tool for all specialists | Understands which agent to call based on the user query |
| Search Agent | Handles web searches to find relevant information | Tavily Search (LangChain) | Returns structured search results with links and snippets |
| Content Extractor | Analyzes web pages in depth | Crawl4AI (custom tool) | Extracts, parses, and structures content from URLs |
| Summarizer | Condenses information into clear summaries | None (pure LLM reasoning) | Creates concise, accurate summaries at different lengths |

This structure follows the “Coordinator/Dispatcher” pattern: the Coordinator is the central agent that routes tasks to specialists, and the specialists are integrated as tools (ADK’s Agent-as-a-Tool pattern) rather than handed control through LLM-driven delegation.
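In miniature, the difference between the two styles looks like this (a minimal sketch with placeholder names; we build the real coordinator later in this tutorial):

# A minimal sketch (placeholder names) contrasting the two integration styles.
# Agent-as-a-Tool keeps the coordinator in charge of the final answer;
# sub_agents would instead hand the conversation over to the specialist.
from google.adk import Agent
from google.adk.tools.agent_tool import AgentTool

specialist = Agent(
    name="specialist",
    model="gemini-2.0-flash",
    description="A placeholder specialist agent.",
)

coordinator = Agent(
    name="coordinator",
    model="gemini-2.0-flash",
    description="Routes work to specialists.",
    tools=[AgentTool(agent=specialist)],  # what we use: specialist called like a function
    # sub_agents=[specialist],            # the alternative: LLM-driven delegation
)

With that distinction in mind, let’s build each component step by step.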

Prerequisites

Before starting, make sure you have:

  1. Python 3.9+ installed
  2. A Google AI Studio API key (from aistudio.google.com)
  3. A Tavily API key (from tavily.com)
  4. Familiarity with the basic ADK concepts covered in our first tutorial

Setting Up Your Project

First, let’s set up our project structure and dependencies:

# Create project directory
mkdir adk-research-team
cd adk-research-team

# Initialize the project and create a virtual environment with uv
uv init
uv venv

Now let’s install the necessary packages:

# Install required packages
uv add google-adk langchain-community tavily-python python-dotenv crawl4ai

Create a .env file in your project root to store your API keys:

# Create .env file for API keys
touch .env

Add your API keys to the .env file. Setting GOOGLE_GENAI_USE_VERTEXAI="False" tells ADK to call Gemini with your AI Studio API key rather than through Vertex AI:

# .env file
GOOGLE_API_KEY=your_google_ai_studio_api_key
TAVILY_API_KEY=your_tavily_api_key
GOOGLE_GENAI_USE_VERTEXAI="False"

Project Structure

To ensure our system works properly with adk web, we need to set up our project with a specific file structure:

adk-research-team/
├── .env                       # API keys
└── agent_module/              # Agent package
    ├── __init__.py            # Makes it a proper Python package
    ├── agent.py               # Main file with root_agent
    ├── search_agent.py        # Search specialist
    ├── content_extractor.py   # Content analysis specialist
    └── summarizer.py          # Summarization specialist

First, create the agent_module directory and initialize it as a Python package:

mkdir -p agent_module
touch agent_module/__init__.py

Add the following to agent_module/__init__.py:

# This file makes agent_module a proper Python package
# The import below ensures that agent.py is accessible
from . import agent

Creating Our Specialized Agents

Let’s implement each of our specialized agents one by one, starting with the most fundamental ones and working our way up to the coordinator.

1. The Search Agent with Tavily

touch agent_module/search_agent.py

Now, let’s implement the search agent:

from dotenv import load_dotenv
import os
from google.adk import Agent
from google.adk.tools.langchain_tool import LangchainTool
from langchain_community.tools import TavilySearchResults

# Load environment variables
load_dotenv()

def create_search_agent():
    """
    Creates an agent specialized in web searching using Tavily Search.
    """
    # Check if API key is available
    tavily_api_key = os.getenv("TAVILY_API_KEY")
    if not tavily_api_key:
        raise ValueError("TAVILY_API_KEY not found in environment variables")

    # Create Tavily Search tool with LangChain
    tavily_tool_instance = TavilySearchResults(
        max_results=5,  # Return 5 results per search
        search_depth="advanced",  # Use advanced search for more comprehensive results
        include_answer=True,  # Include a direct answer when possible
        include_raw_content=True,  # Include the raw content from search results
        include_images=False  # Don't include images in the results
    )

    # Wrap the LangChain tool for ADK
    adk_tavily_tool = LangchainTool(tool=tavily_tool_instance)

    # Create and return the search agent
    search_agent = Agent(
        name="search_agent",
        model="gemini-2.0-flash",
        description="A specialized agent that searches the web for information using Tavily Search API.",
        instruction="""You are a web research specialist.

        When asked to find information about a topic, craft an effective search query and use the TavilySearchResults tool.

        After receiving search results:
        1. Parse the response which may contain a direct answer and multiple search results.
        2. Format the results in a clear, structured way, with each result showing the title, link, and a brief preview of the content.
        3. Highlight the most relevant results based on the original query.
        4. If Tavily provided a direct answer, present that first as the most likely answer.

        If the search doesn't return useful results, suggest refined search terms for a follow-up search.

        Avoid making up information - only report what is found in the search results.
        """,
        tools=[adk_tavily_tool]
    )

    return search_agent
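Before wiring it into an agent, you can sanity-check the Tavily tool on its own. A quick optional snippet (assuming TAVILY_API_KEY is set in your .env):

# Optional: quick standalone check of the LangChain Tavily tool
from dotenv import load_dotenv
from langchain_community.tools import TavilySearchResults

load_dotenv()
tool = TavilySearchResults(max_results=2)
results = tool.invoke({"query": "Google Agent Development Kit"})
for r in results:
    # Each result is a dict with keys such as "url" and "content"
    print(r["url"])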

2. The Content Extractor with Crawl4AI

touch agent_module/content_extractor.py

Now, let’s implement the content extractor tool and agent, including a defensive fix for the CrawlResult title attribute, which is missing in some Crawl4AI versions:

from dotenv import load_dotenv
import os
import asyncio
from crawl4ai import AsyncWebCrawler
from google.adk import Agent
from google.adk.tools import FunctionTool
from google.adk.tools.tool_context import ToolContext

# Load environment variables
load_dotenv()

async def extract_content_from_url(url: str, include_headers: bool = True, tool_context: ToolContext = None) -> dict:
    """
    Extracts content from a URL using Crawl4AI.

    Args:
        url (str): The URL to extract content from.
        include_headers (bool): Whether to include headings in the extraction.
        tool_context (ToolContext, optional): Tool context for ADK.

    Returns:
        dict: A dictionary containing the extracted content, metadata, and status.
    """
    try:
        # Create an instance of AsyncWebCrawler
        async with AsyncWebCrawler() as crawler:
            # Run the crawler on the URL
            result = await crawler.arun(
                url=url,
                include_headers=include_headers
            )

            # Create a structured response
            # FIX: Get the page title safely from the result
            page_title = getattr(result, 'title', None)
            if page_title is None:
                # Try to extract title from markdown or use URL as fallback
                page_title = url.split('/')[-1] if '/' in url else url

                # Try to find title in markdown if available
                if hasattr(result, 'markdown') and result.markdown:
                    # Look for # headers in markdown
                    lines = result.markdown.split('\n')
                    for line in lines:
                        if line.startswith('# '):
                            page_title = line.replace('# ', '')
                            break

            # Guard against a missing or None markdown attribute, which can
            # vary across Crawl4AI versions
            markdown = getattr(result, 'markdown', None) or ""

            response = {
                "status": "success",
                "title": page_title,
                "url": url,
                "markdown_content": markdown[:10000] if markdown else "No content extracted",  # Limit content to 10k chars
                "content_length": len(markdown),
                "headers": [h.text for h in result.headers] if hasattr(result, 'headers') and include_headers else [],
                "word_count": len(markdown.split()),
                "has_truncated_content": len(markdown) > 10000
            }

            # Optional: Store the full content in session state for the summarizer to use
            if tool_context and markdown:
                tool_context.state[f"extracted_content_{url}"] = markdown

            return response
    except Exception as e:
        return {
            "status": "error",
            "url": url,
            "error_message": str(e)
        }

def create_content_extractor_agent():
    """
    Creates an agent specialized in extracting and analyzing content from URLs.
    """
    # Create the FunctionTool for content extraction
    extract_content_tool = FunctionTool(func=extract_content_from_url)

    # Create and return the content extractor agent
    extractor_agent = Agent(
        name="content_extractor",
        model="gemini-2.0-flash",
        description="A specialized agent that extracts and analyzes content from web pages using Crawl4AI.",
        instruction="""You are a web content analysis specialist.

        When given a URL, use the extract_content_from_url tool to fetch and analyze its content.

        After extracting content:
        1. Report the page title and basic metadata (word count, if content was truncated).
        2. List the main headers to provide an overview of the page structure.
        3. Highlight key information found in the content that's most relevant to the user's request.
        4. Note if there were any errors during extraction.

        When extracting content from multiple URLs, organize the information clearly by URL.

        If the extraction fails, explain the error and suggest possible solutions.
        """,
        tools=[extract_content_tool]
    )

    return extractor_agent
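Since extract_content_from_url is an async function, you can exercise it directly with asyncio before involving any agent. A minimal optional check, run from the project root (assuming Crawl4AI’s browser dependencies are installed):

# Optional: quick standalone check of the extractor tool
import asyncio
from agent_module.content_extractor import extract_content_from_url

async def main():
    result = await extract_content_from_url("https://example.com")
    print(result["status"], result.get("title"), result.get("word_count"))

asyncio.run(main())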

3. The Summarizer Agent

touch agent_module/summarizer.py

Now, let’s implement the summarizer agent:
from dotenv import load_dotenv
from google.adk import Agent
from google.adk.tools.tool_context import ToolContext

# Load environment variables
load_dotenv()

def summarize_content(content: str, summary_length: str = "medium", tool_context: ToolContext = None) -> dict:
    """
    Tool to save content for the agent to summarize.

    This doesn't actually summarize the content itself - it just saves the content
    to the session state so the LLM can access it directly.

    Args:
        content (str): The text content to summarize.
        summary_length (str): The desired length of the summary ("short", "medium", or "long").
        tool_context (ToolContext, optional): Tool context for ADK.

    Returns:
        dict: A dictionary with status information.
    """
    if tool_context:
        tool_context.state["content_to_summarize"] = content
        tool_context.state["requested_summary_length"] = summary_length

        # Calculate some metrics on the content
        word_count = len(content.split())

        return {
            "status": "success",
            "message": f"Content saved for summarization (word count: {word_count}). Ready to generate a {summary_length} summary.",
            "word_count": word_count
        }
    else:
        return {
            "status": "error",
            "message": "Tool context not available. Cannot store content."
        }

def create_summarizer_agent():
    """
    Creates an agent specialized in summarizing content.
    """
    # ADK automatically wraps a plain Python function as a FunctionTool,
    # so we can pass summarize_content in directly
    summarize_content_tool = summarize_content

    # Create and return the summarizer agent
    summarizer_agent = Agent(
        name="summarizer",
        model="gemini-2.0-flash",
        description="A specialized agent that summarizes content at various detail levels.",
        instruction="""You are a professional content summarizer.

        First, use the summarize_content tool to load the content into memory.

        Then, summarize the content stored in state['content_to_summarize'] according to the requested length in state['requested_summary_length']:

        - "short": A concise summary in 1-3 sentences, capturing only the essential point.
        - "medium": A balanced summary in 1-3 paragraphs, covering key points and some supporting details.
        - "long": A comprehensive summary in multiple paragraphs, preserving nuances and important contexts.

        Always structure summaries with clear headings and bullet points when appropriate.

        Prioritize accuracy over brevity - never include information not found in the original text.

        For technical or complex content, preserve the key terminology used in the original text.
        """,
        tools=[summarize_content_tool]
    )

    return summarizer_agent
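Because summarize_content degrades gracefully when no ToolContext is supplied, its return shape is easy to check outside ADK (optional):

# Optional: outside ADK there is no ToolContext, so the tool
# reports an error dict instead of raising
from agent_module.summarizer import summarize_content

print(summarize_content("Some article text to summarize."))
# -> {'status': 'error', 'message': 'Tool context not available. Cannot store content.'}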

4. The Main Agent File with Root Agent

Finally, let’s create our main agent file that will expose the root_agent to ADK web:

touch agent_module/agent.py

Add the following to agent_module/agent.py:
from dotenv import load_dotenv
import os
from google.adk import Agent
from google.adk.tools.agent_tool import AgentTool

# Import our specialized agents
from agent_module.search_agent import create_search_agent
from agent_module.content_extractor import create_content_extractor_agent
from agent_module.summarizer import create_summarizer_agent

# Load environment variables
load_dotenv()

# Create instances of our specialized agents
search_agent = create_search_agent()
extractor_agent = create_content_extractor_agent()
summarizer_agent = create_summarizer_agent()

# Wrap the specialized agents as tools
search_tool = AgentTool(agent=search_agent)
extractor_tool = AgentTool(agent=extractor_agent)
summarizer_tool = AgentTool(agent=summarizer_agent)

# Define the root agent that ADK web will use
root_agent = Agent(
    name="research_coordinator",
    model="gemini-2.0-flash",
    description="A coordinator agent that manages a team of specialized research agents.",
    instruction="""You are the coordinator of a research assistant team with specialized agents.

    Your team includes:
    - search_agent: Finds information on the web using Tavily Search.
    - content_extractor: Analyzes and extracts content from specific URLs.
    - summarizer: Creates concise summaries of content at different levels of detail.

    Based on user requests, delegate tasks to the appropriate specialist:

    1. If the user needs to find information on a topic, use the search_agent.
    2. If the user provides a specific URL or wants to analyze a web page, use the content_extractor.
    3. If the user needs a summary of content, use the summarizer.
    4. For complex research tasks, coordinate multiple agents in sequence:
       - First search for relevant information
       - Then extract detailed content from the most promising URLs
       - Finally summarize the findings

    Always present the results from your specialists in a clear, organized manner.
    When coordinating multi-step research, explain the research process to the user.

    Remember: You are responsible for the final response to the user, so ensure it fully addresses their request.
    """,
    tools=[search_tool, extractor_tool, summarizer_tool]
)
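If you prefer to drive the team from a script instead of the web UI (covered next), ADK also exposes a Runner API. A minimal sketch, with the caveat that the exact Runner and session-service signatures have shifted between ADK releases, so verify against your installed version:

# Hypothetical programmatic run without adk web; verify against your ADK version
import asyncio
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.genai import types
from agent_module.agent import root_agent

async def main():
    session_service = InMemorySessionService()
    session = await session_service.create_session(
        app_name="research_team", user_id="user1")
    runner = Runner(agent=root_agent, app_name="research_team",
                    session_service=session_service)
    message = types.Content(role="user",
                            parts=[types.Part(text="What is Google ADK?")])
    async for event in runner.run_async(user_id="user1",
                                        session_id=session.id,
                                        new_message=message):
        if event.is_final_response():
            print(event.content.parts[0].text)

asyncio.run(main())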

Using the ADK Web Interface

Now that we’ve created our multi-agent system, let’s use ADK’s built-in web interface to interact with it. First, make sure you’re in the project root (where your .env file is located), then run:

adk web

This will start a local web server and provide you with a URL (typically http://localhost:8000 or http://127.0.0.1:8000). Open this URL in your browser to access the ADK web interface.

In the interface, you’ll see a list of available agents, including “research_coordinator”. Select it, and you can chat with your multi-agent research team through a clean chat interface.

Here are some example queries to try:

  1. Search for Information: “What are the latest advancements in quantum computing?”
  2. Analyze a URL: “Can you analyze the content at https://www.reuters.com/technology/ and tell me what it’s about?”
  3. Get a Summary: “Summarize this article in a medium length: [paste article text]”
  4. Multi-step Research: “Research the environmental impact of electric vehicles, analyze the top result in detail, and provide a short summary of the key findings.”

How the System Works

Let’s break down how our multi-agent system functions:

  1. The Coordinator as Central Hub:

    • All user requests go to the coordinator agent
    • The coordinator analyzes the request and decides which specialist to call
    • It uses the AgentTool wrapper to “call” specialist agents as if they were functions
  2. Communication Via Session State:

    • Agents share information through the session state dictionary
    • For example, when the content extractor analyzes a URL, it stores the full content in the session state with a key like extracted_content_{url}
    • Later, the summarizer can access this content without having to download it again (see the sketch after this list)
  3. Using LangChain Tools in ADK:

    • The LangchainTool wrapper makes it easy to use the Tavily Search functionality in our ADK agent
    • This demonstrates ADK’s flexibility in integrating with other AI frameworks
  4. Custom Tool Implementation:

    • Our Crawl4AI integration shows how to create custom tools with FunctionTool
    • The improved error handling in the extract_content_from_url function ensures it works reliably even with different versions of Crawl4AI
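Here is the session-state handoff in miniature, with hypothetical tool names; the real system uses the same mechanism with keys like extracted_content_{url} and content_to_summarize:

# A minimal sketch (hypothetical tool names) of two tools sharing data
# through tool_context.state within the same session
from google.adk.tools.tool_context import ToolContext

def save_note(text: str, tool_context: ToolContext) -> dict:
    # Writer: store data under an agreed-upon key
    tool_context.state["note"] = text
    return {"status": "saved"}

def read_note(tool_context: ToolContext) -> dict:
    # Reader: a later tool call in the same session sees the same state
    return {"note": tool_context.state.get("note", "nothing stored yet")}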

Extending the System

The modular design of this agent team makes it easy to extend with new capabilities:

| Potential Enhancement | Implementation Approach |
| --- | --- |
| PDF Document Analysis | Add a new specialist agent with a PDF parsing tool |
| Translation Services | Create a translation agent with a language API tool |
| Data Visualization | Add an agent that can generate charts from extracted data |
| Sentiment Analysis | Implement a specialized agent for analyzing sentiment in content |
| Citation Management | Add a tool to extract and format citations from academic sources |
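For example, the first row of the table could start as a sketch like this (hypothetical names; parse_pdf is a stub you would implement with a library such as pypdf):

# Hypothetical sketch of a new specialist; parse_pdf is a placeholder stub
from google.adk import Agent
from google.adk.tools import FunctionTool

def parse_pdf(path: str) -> dict:
    """Placeholder: extract text from the PDF at `path` (e.g., with pypdf)."""
    return {"status": "error", "message": "PDF parsing not implemented yet"}

pdf_agent = Agent(
    name="pdf_analyst",
    model="gemini-2.0-flash",
    description="A specialized agent that analyzes PDF documents.",
    instruction="Use the parse_pdf tool, then report the document's key points.",
    tools=[FunctionTool(func=parse_pdf)],
)

# Then wrap it for the coordinator exactly like the other specialists:
# tools=[search_tool, extractor_tool, summarizer_tool, AgentTool(agent=pdf_agent)]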

Troubleshooting Common Issues

Here are some common issues you might encounter when working with ADK web and multi-agent systems:

| Issue | Solution |
| --- | --- |
| Agent not showing in adk web | Ensure you have a root_agent variable defined in your agent.py file and that your __init__.py properly imports it with from . import agent. |
| “No attribute ‘title’” error | Fixed by the fallback logic in the extract_content_from_url function, which handles missing attributes on the CrawlResult object. |
| Import errors | Ensure you’re using the correct import structure (from google.adk import Agent). |
| API key issues | Double-check that your .env file is in the right location (at the project root, not inside agent_module) and that keys are loaded properly with load_dotenv(). |
| Tool not found | Make sure tool functions are properly wrapped (LangchainTool, FunctionTool) and added to the agent’s tools list. |
| AttributeError with Crawl4AI | Always use hasattr()/getattr() checks before accessing attributes of third-party library objects to handle API changes gracefully. |
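For the first two issues in particular, a quick manual import from the project root often pinpoints the problem faster than the web UI:

# Optional diagnostic: if this import fails, adk web won't find the agent either
from agent_module import agent
print(agent.root_agent.name)  # expected: research_coordinator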

Conclusion

In this tutorial, we’ve built a sophisticated multi-agent research system using Google’s Agent Development Kit. By combining specialized agents for web search with Tavily, content extraction, and summarization, we’ve created a powerful team that can perform complex research tasks with a clear division of responsibilities.

One of the great advantages of using ADK is the ability to quickly test and interact with your agents using the built-in adk web tool, which provides a clean, intuitive interface without requiring you to build a frontend. This makes development and testing significantly faster.

We’ve also learned how to handle potential issues with third-party libraries like Crawl4AI by building robust error handling into our tools. This makes our multi-agent system more reliable and maintainable.

This architecture demonstrates several key ADK concepts:

  • Using the Coordinator/Dispatcher pattern for agent orchestration
  • Integrating third-party tools from LangChain (Tavily Search)
  • Creating custom tools with FunctionTool
  • Implementing the Agent-as-a-Tool pattern with AgentTool
  • Sharing information through session state
  • Exposing your agent for use with adk web
  • Building robust error handling for third-party dependencies

As you continue exploring ADK, consider how this multi-agent approach could solve other complex problems by breaking them down into specialized components that work together seamlessly. The possibilities are endless, from customer service automation to data analysis pipelines and beyond.

Happy building with Google ADK!
