# The Hugging Face Agents Course


## Course Related Links

## Authors

  • Joffrey Thomas
  • Ben Burtenshaw
  • Thomas Simonini

## Getting Started

```bash
# Make sure you have git-lfs installed (https://git-lfs.com)
brew install git-lfs
# The Hugging Face repositories/Spaces are tracked as submodules of this repository
git lfs install
# Log in with a token (needed to push code back to the Hugging Face repositories)
# https://discuss.huggingface.co/t/cant-push-to-new-space/35319/4
huggingface-cli login

# https://packaging.python.org/en/latest/guides/installing-using-pip-and-virtual-environments/
python -m venv .venv
source .venv/bin/activate

# https://huggingface.co/docs/transformers/installation
pip install -r requirements.txt
pip install -r spaces/Unit_1-First_Agent/requirements.txt

# Set up the Hugging Face API key (https://hf.co/settings/tokens)
cp .env.example .env
```
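
Once the key is in `.env`, a minimal sketch of loading it in Python, assuming python-dotenv is available and `.env.example` names the variable `HF_TOKEN` (adjust to match the actual file):

```python
# Minimal sketch: read the Hugging Face token from .env.
# Assumes python-dotenv is installed and .env defines HF_TOKEN.
import os
from dotenv import load_dotenv

load_dotenv()                      # loads variables from ./.env into the environment
hf_token = os.environ["HF_TOKEN"]  # raises KeyError if the key is missing
```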

## Schedule


| Week | Unit | Topic | Lectures | Quiz | Assignments | Others |
|---|---|---|---|---|---|---|
| - | 0 | Welcome to the Course | Welcome To The Agents Course! Introduction to the Course and Q&A - YouTube | - | - | - |
| 2025/2/10~2/16 | 1 | Introduction to Agents | - | Unit 1 Quiz | First Agent | Unit 1 Notebook, Try Dummy Agent and smolagents |
| 2025/2/17~2/23 | Bonus | Fine-tune your agent | - | - | - | - |
| 2025/2/24~3/9 | 2 | 2_frameworks | - | - | - | - |
| 2025/3/10~3/31 | 3 | 3_use_cases | - | - | - | - |
| 2025/4/1~4/30 | 4 | 4_final_assignment_with_benchmark | - | - | - | - |

## Unit 0. Welcome to the Course

Welcome, guidelines, necessary tools, and course overview.

  1. Welcome to the 🤗 AI Agents Course
  2. Onboarding: Your First Steps ⛵
    1. Create your Hugging Face Account
    2. Sign up to Discord and introduce yourself
    3. Follow the Hugging Face Agents Course
    4. Spread the word about the course
  3. (Optional) Discord 101

## Unit 1. Introduction to Agents

Definition of agents, LLMs, model family tree, and special tokens.


  1. Introduction to Agents
    • Understanding Agents
      • What is an Agent, and how does it work?
      • How do Agents make decisions using reasoning and planning?
    • The Role of LLMs (Large Language Models) in Agents
      • How LLMs serve as the “brain” behind an Agent.
      • How LLMs structure conversations via the Messages system.
    • Tools and Actions
      • How Agents use external tools to interact with the environment.
      • How to build and integrate tools for your Agent.
    • The Agent Workflow:
      • Think → Act → Observe.
  2. What is an Agent?
    • An Agent is a system that leverages an AI model to interact with its environment in order to achieve a user-defined objective. It combines reasoning, planning, and the execution of actions (often via external tools) to fulfill tasks.
      1. The Brain (AI Model)
        • LLM (Large Language Model): e.g., GPT-4 from OpenAI, Llama from Meta, Gemini from Google, ...
        • VLM (Vision Language Model)
      2. The Body (Capabilities and Tools)
    • To summarize, an Agent is a system that uses an AI Model (typically an LLM) as its core reasoning engine to:
      • Understand natural language: Interpret and respond to human instructions in a meaningful way.
      • Reason and plan: Analyze information, make decisions, and devise strategies to solve problems.
      • Interact with its environment: Gather information, take actions, and observe the results of those actions.
  3. Small Quiz (ungraded) (Quick Quiz 1)
  4. What are LLMs?
  5. Messages and Special Tokens
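
A minimal sketch of how the Messages system maps to special tokens via a tokenizer's chat template (the model name here is just an example):

```python
# Sketch: a list of messages is rendered into one prompt string,
# using the special tokens this particular model was trained on.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-1.7B-Instruct")
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is an Agent?"},
]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)  # shows the <|im_start|>/<|im_end|> structure this model expects
```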
  6. What are Tools?
    • A Tool should contain:
      • A textual description of what the function does.
      • A callable (something that performs an action).
      • Arguments with typings.
      • (Optional) Outputs with typings.
    • The tool description is injected into the system prompt, telling the model (see the sketch below):
      • What the tool does
      • What exact inputs it expects
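
A minimal sketch of such a Tool with smolagents; the weather lookup is a hypothetical stub:

```python
# Sketch: description (docstring), callable, typed arguments, typed output.
from smolagents import tool

@tool
def get_weather(city: str) -> str:
    """Return a short weather report for a city.

    Args:
        city: Name of the city to look up.
    """
    # Hypothetical stub: a real tool would query a weather API here.
    return f"The weather in {city} is sunny."
```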
  7. Quick Self-Check (ungraded) (Quick Quiz 2)
  8. Understanding AI Agents through the Thought-Action-Observation Cycle
    • Agents work in a continuous cycle of: thinking (Thought) → acting (Act) and observing (Observe).
      1. Thought: The LLM part of the Agent decides what the next step should be.
      2. Action: The agent takes an action, by calling the tools with the associated arguments.
      3. Observation: The model reflects on the response from the tool.
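
A schematic version of that cycle; `llm`, `tools`, and `parse_action` are illustrative stand-ins, not a real library API:

```python
import json

def parse_action(text: str):
    """Illustrative helper: pull the JSON payload after 'Action:' out of the generation."""
    payload = json.loads(text.split("Action:")[1].strip())
    return payload["tool"], payload["arguments"]

def run_agent(task: str, llm, tools: dict, max_steps: int = 5) -> str:
    context = task
    for _ in range(max_steps):
        step = llm(context)                    # Thought: the model decides the next step
        if "Final Answer:" in step:            # the model chose to answer directly
            return step
        name, args = parse_action(step)        # Action: which tool, which arguments
        observation = tools[name](**args)      # execute the tool call
        context += f"\n{step}\nObservation: {observation}"  # Observe: update the context
    return context
```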
  9. Thought: Internal Reasoning and the Re-Act Approach
    • ReAct (papers.cool): “Reasoning” (Think) with “Acting” (Act)
      • ReAct is a simple prompting technique that appends “Let’s think step by step” before letting the LLM decode the next tokens.
      • We have recently seen a lot of interest in reasoning strategies. This is what's behind models like DeepSeek R1 or OpenAI's o1, which have been fine-tuned to "think before answering".
  10. Actions: Enabling the Agent to Engage with Its Environment
    • One key method for implementing actions is the Stop and Parse Approach. This method ensures that the agent’s output is structured and predictable:
      1. Generation in a Structured Format: The agent outputs its intended action in a clear, predetermined format (JSON or code).
      2. Halting Further Generation: Once the action is complete, the agent stops generating additional tokens. This prevents extra or erroneous output.
      3. Parsing the Output: An external parser reads the formatted action, determines which Tool to call, and extracts the required parameters.
    • An alternative approach is using Code Agents. The idea is: instead of outputting a simple JSON object, a Code Agent generates an executable code block—typically in a high-level language like Python.
      • Expressiveness: Code can naturally represent complex logic, including loops, conditionals, and nested functions, providing greater flexibility than JSON.
      • Modularity and Reusability: Generated code can include functions and modules that are reusable across different actions or tasks.
      • Enhanced Debuggability: With a well-defined programming syntax, code errors are often easier to detect and correct.
      • Direct Integration: Code Agents can integrate directly with external libraries and APIs, enabling more complex operations such as data processing or real-time decision making.
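
A minimal stop-and-parse sketch; the model output below is hard-coded for illustration, and in practice generation is halted with a stop sequence such as "Observation:":

```python
import json

# Pretend LLM output, halted at the "Observation:" stop sequence.
raw = (
    'Thought: I should look up the weather.\n'
    'Action: {"tool": "get_weather", "arguments": {"city": "Paris"}}\n'
    'Observation:'
)

# Parse the structured action out of the generation...
action_text = raw.split("Action:")[1].split("Observation:")[0].strip()
action = json.loads(action_text)

# ...then dispatch to the named Tool with the extracted arguments.
tools = {"get_weather": lambda city: f"Sunny in {city}."}  # stub registry
print(tools[action["tool"]](**action["arguments"]))
```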
  11. Observe: Integrating Feedback to Reflect and Adapt
    • Observations are how an Agent perceives the consequences of its actions.
      • Collects Feedback: Receives data or confirmation that its action was successful (or not).
      • Appends Results: Integrates the new information into its existing context, effectively updating its memory.
      • Adapts its Strategy: Uses this updated context to refine subsequent thoughts and actions.
    • After performing an action, the framework follows these steps in order:
      1. Parse the action to identify the function(s) to call and the argument(s) to use.
      2. Execute the action.
      3. Append the result as an Observation.
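
In chat-message form, the same parse → execute → append sequence looks roughly like this (message contents are illustrative):

```python
# Sketch: the tool result re-enters the context as an Observation,
# so the next Thought can condition on it.
messages = [
    {"role": "user", "content": "What's the weather in Paris?"},
    {"role": "assistant", "content": 'Action: {"tool": "get_weather", "arguments": {"city": "Paris"}}'},
]
observation = "Sunny in Paris."  # result of executing the parsed action
messages.append({"role": "user", "content": f"Observation: {observation}"})
```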
  12. Dummy Agent Library (ran into the hallucination issue; an OpenAI model or a slightly bigger model works)
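
The dummy-agent part boils down to calling a serverless endpoint directly; a sketch using the model from the course notebook (the note above swaps in an OpenAI model or a slightly bigger one):

```python
# Sketch: raw text generation through the serverless Inference API.
from huggingface_hub import InferenceClient

client = InferenceClient("meta-llama/Llama-3.2-3B-Instruct")
output = client.text_generation("The capital of France is", max_new_tokens=20)
print(output)
```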
  13. Let’s Create Our First Agent Using smolagents (failed to use the course’s HfApiModel API endpoint; currently using an OpenAI model — a local model might be too weak — or the slightly bigger model from the Dummy Agent part)
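
A sketch of the workaround described above, pointing smolagents at an OpenAI model instead of HfApiModel; the model id is an example and OPENAI_API_KEY must be set:

```python
# Sketch: CodeAgent backed by an OpenAI-compatible endpoint.
from smolagents import CodeAgent, DuckDuckGoSearchTool, OpenAIServerModel

model = OpenAIServerModel(model_id="gpt-4o-mini")  # reads OPENAI_API_KEY from the env
agent = CodeAgent(tools=[DuckDuckGoSearchTool()], model=model)
agent.run("What is the weather in Paris right now?")
```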
  14. Unit 1 Quiz
  15. Get your certificate
  16. Conclusion


## Bonus Unit. Fine-tune your agent

Fine-tune an Agent to do function calling (i.e., to call tools based on the user prompt).

  1. Introduction
    1. Know how to Fine-Tune an LLM with Transformers
    2. Know how to use SFTTrainer to fine-tune a Hugging Face model
  2. What is Function Calling? => Function-calling is a way for an LLM to take actions on its environment
  3. Let’s Fine-Tune your model for function-calling
  4. Conclusion
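
A compressed sketch of the fine-tuning step above; the model and dataset names are placeholders, and the real notebook adds LoRA, chat-template preprocessing, and special-token setup:

```python
# Sketch: supervised fine-tuning for function calling with TRL's SFTTrainer.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("Jofthomas/hermes-function-calling-thinking-V1", split="train")
trainer = SFTTrainer(
    model="google/gemma-2-2b-it",                         # placeholder base model
    train_dataset=dataset,
    args=SFTConfig(output_dir="gemma-fc", max_steps=100),  # short demo run
)
trainer.train()
```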

## Unit 2. Frameworks

Overview of smolagents, LangChain, LangGraph, and LlamaIndex.

## Unit 3. Use Cases

SQL, code, retrieval, and on-device agents using various frameworks.

## Unit 4. Final Assignment with Benchmark

Automated evaluation of agents and leaderboard with student results.

## Resources

### Packages

### Videos

### Courses