# The Hugging Face Agents Course


## Course Related Links

## Authors

  • Joffrey Thomas
  • Ben Burtenshaw
  • Thomas Simonini

## Getting Started

```bash
# Make sure you have git-lfs installed (https://git-lfs.com)
brew install git-lfs
# The Hugging Face repositories/Spaces are tracked as submodules of this repository
git lfs install
# Log in with a token (needed to push code back to the Hugging Face repositories)
# https://discuss.huggingface.co/t/cant-push-to-new-space/35319/4
huggingface-cli login

# https://packaging.python.org/en/latest/guides/installing-using-pip-and-virtual-environments/
python -m venv .venv
source .venv/bin/activate

# https://huggingface.co/docs/transformers/installation
pip install -r requirements.txt
pip install -r spaces/Unit_1-First_Agent/requirements.txt

# Set up the Hugging Face API key (https://hf.co/settings/tokens)
cp .env.example .env
```
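
Once the key is in `.env`, a minimal sketch of loading it in Python, assuming python-dotenv is available and `.env.example` names the variable `HF_TOKEN` (adjust to match the actual file):

```python
# Minimal sketch: read the Hugging Face token from .env.
# Assumes python-dotenv is installed and .env defines HF_TOKEN.
import os
from dotenv import load_dotenv

load_dotenv()                      # loads variables from ./.env into the environment
hf_token = os.environ["HF_TOKEN"]  # raises KeyError if the key is missing
```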

## Schedule


| Week | Unit | Topic | Lectures | Quiz | Assignments | Others |
|---|---|---|---|---|---|---|
| - | 0 | Welcome to the Course | Welcome To The Agents Course! Introduction to the Course and Q&A - YouTube | - | - | - |
| 2025/2/10~2/16 | 1 | Introduction to Agents | - | Unit 1 Quiz | First Agent | Unit 1 Notebook, Try Dummy Agent and smolagents |
| 2025/2/17~2/23 | Bonus | Fine-tune your agent | - | - | - | - |
| 2025/2/24~3/9 | 2 | 2_frameworks | - | - | - | - |
| 2025/3/10~3/31 | 3 | 3_use_cases | - | - | - | - |
| 2025/4/1~4/30 | 4 | 4_final_assignment_with_benchmark | - | - | - | - |

## Unit 0. Welcome to the Course

Welcome, guidelines, necessary tools, and course overview.

  1. Welcome to the 🤗 AI Agents Course
  2. Onboarding: Your First Steps ⛵
    1. Create your Hugging Face Account
    2. Sign up to Discord and introduce yourself
    3. Follow the Hugging Face Agents Course
    4. Spread the word about the course
  3. (Optional) Discord 101

## Unit 1. Introduction to Agents

Definition of agents, LLMs, model family tree, and special tokens.


  1. Introduction to Agents
    • Understanding Agents
      • What is an Agent, and how does it work?
      • How do Agents make decisions using reasoning and planning?
    • The Role of LLMs (Large Language Models) in Agents
      • How LLMs serve as the “brain” behind an Agent.
      • How LLMs structure conversations via the Messages system.
    • Tools and Actions
      • How Agents use external tools to interact with the environment.
      • How to build and integrate tools for your Agent.
    • The Agent Workflow:
      • Think → Act → Observe.
  2. What is an Agent?
    • An Agent is a system that leverages an AI model to interact with its environment in order to achieve a user-defined objective. It combines reasoning, planning, and the execution of actions (often via external tools) to fulfill tasks.
      1. The Brain (AI Model)
        • LLM (Large Language Model): e.g., GPT-4 from OpenAI, Llama from Meta, Gemini from Google, ...
        • VLM (Vision Language Model)
      2. The Body (Capabilities and Tools)
    • To summarize, an Agent is a system that uses an AI Model (typically an LLM) as its core reasoning engine to:
      • Understand natural language: Interpret and respond to human instructions in a meaningful way.
      • Reason and plan: Analyze information, make decisions, and devise strategies to solve problems.
      • Interact with its environment: Gather information, take actions, and observe the results of those actions.
  3. Small Quiz (ungraded) (Quick Quiz 1)
  4. What are LLMs?
  5. Messages and Special Tokens
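
A minimal sketch of how the Messages system maps to special tokens via a tokenizer's chat template (the model name here is just an example):

```python
# Sketch: a list of messages is rendered into one prompt string,
# using the special tokens this particular model was trained on.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-1.7B-Instruct")
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is an Agent?"},
]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)  # shows the <|im_start|>/<|im_end|> structure this model expects
```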
  6. What are Tools?
    • A Tool should contain:
      • A textual description of what the function does.
      • A callable (something that performs an action).
      • Arguments with typings.
      • (Optional) Outputs with typings.
    • The tool description is injected into the system prompt, telling the model (see the sketch below):
      • What the tool does
      • What exact inputs it expects
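
A minimal sketch of such a Tool with smolagents; the weather lookup is a hypothetical stub:

```python
# Sketch: description (docstring), callable, typed arguments, typed output.
from smolagents import tool

@tool
def get_weather(city: str) -> str:
    """Return a short weather report for a city.

    Args:
        city: Name of the city to look up.
    """
    # Hypothetical stub: a real tool would query a weather API here.
    return f"The weather in {city} is sunny."
```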
  7. Quick Self-Check (ungraded) (Quick Quiz 2)
  8. Understanding AI Agents through the Thought-Action-Observation Cycle
    • Agents work in a continuous cycle of: thinking (Thought) → acting (Act) and observing (Observe).
      1. Thought: The LLM part of the Agent decides what the next step should be.
      2. Action: The agent takes an action, by calling the tools with the associated arguments.
      3. Observation: The model reflects on the response from the tool.
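
A schematic version of that cycle; `llm`, `tools`, and `parse_action` are illustrative stand-ins, not a real library API:

```python
import json

def parse_action(text: str):
    """Illustrative helper: pull the JSON payload after 'Action:' out of the generation."""
    payload = json.loads(text.split("Action:")[1].strip())
    return payload["tool"], payload["arguments"]

def run_agent(task: str, llm, tools: dict, max_steps: int = 5) -> str:
    context = task
    for _ in range(max_steps):
        step = llm(context)                    # Thought: the model decides the next step
        if "Final Answer:" in step:            # the model chose to answer directly
            return step
        name, args = parse_action(step)        # Action: which tool, which arguments
        observation = tools[name](**args)      # execute the tool call
        context += f"\n{step}\nObservation: {observation}"  # Observe: update the context
    return context
```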
  9. Thought: Internal Reasoning and the Re-Act Approach
    • ReAct (papers.cool): “Reasoning” (Think) with “Acting” (Act)
      • ReAct is a simple prompting technique that appends “Let’s think step by step” before letting the LLM decode the next tokens.
      • We have recently seen a lot of interest in reasoning strategies. This is what's behind models like DeepSeek R1 or OpenAI's o1, which have been fine-tuned to "think before answering".
  10. Actions: Enabling the Agent to Engage with Its Environment
    • One key method for implementing actions is the Stop and Parse Approach. This method ensures that the agent’s output is structured and predictable:
      1. Generation in a Structured Format: The agent outputs its intended action in a clear, predetermined format (JSON or code).
      2. Halting Further Generation: Once the action is complete, the agent stops generating additional tokens. This prevents extra or erroneous output.
      3. Parsing the Output: An external parser reads the formatted action, determines which Tool to call, and extracts the required parameters.
    • An alternative approach is using Code Agents. The idea is: instead of outputting a simple JSON object, a Code Agent generates an executable code block—typically in a high-level language like Python.
      • Expressiveness: Code can naturally represent complex logic, including loops, conditionals, and nested functions, providing greater flexibility than JSON.
      • Modularity and Reusability: Generated code can include functions and modules that are reusable across different actions or tasks.
      • Enhanced Debuggability: With a well-defined programming syntax, code errors are often easier to detect and correct.
      • Direct Integration: Code Agents can integrate directly with external libraries and APIs, enabling more complex operations such as data processing or real-time decision making.
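
A minimal stop-and-parse sketch; the model output below is hard-coded for illustration, and in practice generation is halted with a stop sequence such as "Observation:":

```python
import json

# Pretend LLM output, halted at the "Observation:" stop sequence.
raw = (
    'Thought: I should look up the weather.\n'
    'Action: {"tool": "get_weather", "arguments": {"city": "Paris"}}\n'
    'Observation:'
)

# Parse the structured action out of the generation...
action_text = raw.split("Action:")[1].split("Observation:")[0].strip()
action = json.loads(action_text)

# ...then dispatch to the named Tool with the extracted arguments.
tools = {"get_weather": lambda city: f"Sunny in {city}."}  # stub registry
print(tools[action["tool"]](**action["arguments"]))
```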
  11. Observe: Integrating Feedback to Reflect and Adapt
    • Observations are how an Agent perceives the consequences of its actions.
      • Collects Feedback: Receives data or confirmation that its action was successful (or not).
      • Appends Results: Integrates the new information into its existing context, effectively updating its memory.
      • Adapts its Strategy: Uses this updated context to refine subsequent thoughts and actions.
    • After performing an action, the framework follows these steps in order:
      1. Parse the action to identify the function(s) to call and the argument(s) to use.
      2. Execute the action.
      3. Append the result as an Observation.
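
In chat-message form, the same parse → execute → append sequence looks roughly like this (message contents are illustrative):

```python
# Sketch: the tool result re-enters the context as an Observation,
# so the next Thought can condition on it.
messages = [
    {"role": "user", "content": "What's the weather in Paris?"},
    {"role": "assistant", "content": 'Action: {"tool": "get_weather", "arguments": {"city": "Paris"}}'},
]
observation = "Sunny in Paris."  # result of executing the parsed action
messages.append({"role": "user", "content": f"Observation: {observation}"})
```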
  12. Dummy Agent Library (ran into the hallucination issue; an OpenAI model or a slightly bigger model works)
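
The dummy-agent part boils down to calling a serverless endpoint directly; a sketch using the model from the course notebook (the note above swaps in an OpenAI model or a slightly bigger one):

```python
# Sketch: raw text generation through the serverless Inference API.
from huggingface_hub import InferenceClient

client = InferenceClient("meta-llama/Llama-3.2-3B-Instruct")
output = client.text_generation("The capital of France is", max_new_tokens=20)
print(output)
```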
  13. Let’s Create Our First Agent Using smolagents (failed to use the course’s HfApiModel API endpoint; currently using an OpenAI model — a local model might be too weak — or the slightly bigger model from the Dummy Agent part)
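
A sketch of the workaround described above, pointing smolagents at an OpenAI model instead of HfApiModel; the model id is an example and OPENAI_API_KEY must be set:

```python
# Sketch: CodeAgent backed by an OpenAI-compatible endpoint.
from smolagents import CodeAgent, DuckDuckGoSearchTool, OpenAIServerModel

model = OpenAIServerModel(model_id="gpt-4o-mini")  # reads OPENAI_API_KEY from the env
agent = CodeAgent(tools=[DuckDuckGoSearchTool()], model=model)
agent.run("What is the weather in Paris right now?")
```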
  14. Unit 1 Quiz
  15. Get your certificate
  16. Conclusion


## Bonus Unit. Fine-tune your agent

Fine-tune an Agent to do function calling (i.e., to call tools based on the user prompt).

  1. Introduction
    1. Know how to Fine-Tune an LLM with Transformers
    2. Know how to use SFTTrainer to fine-tune a Hugging Face model
  2. What is Function Calling? => Function-calling is a way for an LLM to take actions on its environment
  3. Let’s Fine-Tune your model for function-calling
  4. Conclusion
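
A compressed sketch of the fine-tuning step above; the model and dataset names are placeholders, and the real notebook adds LoRA, chat-template preprocessing, and special-token setup:

```python
# Sketch: supervised fine-tuning for function calling with TRL's SFTTrainer.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("Jofthomas/hermes-function-calling-thinking-V1", split="train")
trainer = SFTTrainer(
    model="google/gemma-2-2b-it",                         # placeholder base model
    train_dataset=dataset,
    args=SFTConfig(output_dir="gemma-fc", max_steps=100),  # short demo run
)
trainer.train()
```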

## Unit 2. Frameworks

Overview of smolagents, LangChain, LangGraph, and LlamaIndex.

## Unit 3. Use Cases

SQL, code, retrieval, and on-device agents using various frameworks.

## Unit 4. Final Assignment with Benchmark

Automated evaluation of agents and leaderboard with student results.

## Resources

### Packages

### Videos

### Courses