Gemini-Speaker is a Python-based spoken-English training assistant powered by Google's Gemini API. It gives real-time feedback on pronunciation, grammar, and sentence structure, and guides users step by step toward better spoken English.
- Real-Time Voice Input:
  - Captures your voice using the microphone and processes it seamlessly.
- AI Feedback:
  - Corrects pronunciation mistakes.
  - Highlights grammar errors.
  - Provides suggestions for improvement.
- Interactive Training Loop:
  - Suggests contextual sentences for continuous practice.
  - Listens, evaluates, and progresses the training dynamically.
- Seamless User Experience:
  - Runs in your terminal with interactive outputs.
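
The listen, evaluate, repeat cycle described under "Interactive Training Loop" can be pictured roughly as follows. This is only a sketch, not the project's actual code: `training_loop`, `utterances`, and `evaluate` are hypothetical names, with `utterances` standing in for transcribed microphone input and `evaluate` for the Gemini-backed feedback step. Matching the exit phrase by substring is also an assumption.

```python
from typing import Callable, Iterable, Iterator, Tuple

# Exit phrase taken from the usage instructions in this README;
# substring matching is an assumption of this sketch.
EXIT_PHRASE = "我要退出"


def training_loop(
    utterances: Iterable[str],
    evaluate: Callable[[str], str],
) -> Iterator[Tuple[str, str]]:
    """Yield (utterance, feedback) pairs until the exit phrase is heard.

    Both arguments are hypothetical placeholders: `utterances` replaces
    live speech input and `evaluate` replaces the AI feedback call.
    """
    for spoken in utterances:
        if EXIT_PHRASE in spoken:
            break  # the user asked to stop
        yield spoken, evaluate(spoken)


if __name__ == "__main__":
    # Toy evaluator so the sketch runs without a microphone or an API key.
    def fake_evaluate(sentence: str) -> str:
        return f"Heard {len(sentence.split())} words. Please try again!"

    session = ["What is block chain?", "What is blockchain?", "OK, 我要退出"]
    for spoken, feedback in training_loop(session, fake_evaluate):
        print(f"User: {spoken}")
        print(f"AI:   {feedback}")
```

Passing the input source and the evaluator in as arguments keeps the loop testable without audio hardware or network access.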

Prerequisites:
- Python 3.11 or higher
- pip
- Google API key
- Required libraries: see requirements.txt.
- Clone the repository:
  git clone https://github.com/Xudong-Mao/Gemini-Speaker.git
  cd Gemini-Speaker
- Install dependencies:
  pip install -r requirements.txt
- Set up the environment variables:
  - Create a .env file in the root directory.
  - Add your Google API key:
    GOOGLE_API_KEY=your_google_api_key_here
- Run the program:
  python main.py
- Follow the on-screen instructions:
  - Speak an English sentence (e.g., "What is blockchain?").
  - Receive AI feedback and suggestions.
- Say "OK, 我要退出" ("OK, I want to exit") to exit the program.
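
Before running main.py, it can help to confirm that the .env step took effect. The sketch below is a standard-library-only check, not how the project itself reads the key (the dependency list suggests python-dotenv handles that). The helper names `read_env_value` and `api_key_is_configured` are hypothetical, and the parser assumes plain KEY=value lines.

```python
from pathlib import Path
from typing import Optional


def read_env_value(env_path: Path, name: str) -> Optional[str]:
    """Return the value assigned to `name` in a simple KEY=value file, or None."""
    if not env_path.is_file():
        return None
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blank lines, comments, and malformed lines
        key, _, value = line.partition("=")
        if key.strip() == name:
            # Drop surrounding whitespace and optional quotes.
            return value.strip().strip("'\"") or None
    return None


def api_key_is_configured(env_path: Path = Path(".env")) -> bool:
    """True if GOOGLE_API_KEY is set to something other than the placeholder."""
    value = read_env_value(env_path, "GOOGLE_API_KEY")
    return value is not None and value != "your_google_api_key_here"


if __name__ == "__main__":
    print("GOOGLE_API_KEY configured:", api_key_is_configured())
```

Run it from the repository root so that the default `.env` path resolves to the file created in the setup step.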
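
Example session (the program prints its prompts in Chinese; English glosses are added in parentheses):
🎤 说一句英语吧!比如: What is blockchain? (Say a sentence in English! For example: What is blockchain?)
User: What is block chain?
🤖 =============================================
你说的句子是: "What is block chain?" (You said: "What is block chain?")
发音错误: "block chain" 应发 /ˈblɒkˌtʃeɪn/。 (Pronunciation error: "block chain" should be pronounced /ˈblɒkˌtʃeɪn/.)
语法提示: 词汇拼写正确,注意连读发音。 (Grammar tip: the spelling is correct; pay attention to linking the words when you pronounce them.)
请再试一次! (Please try again!)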

Dependencies:
- Python >= 3.11
- pyaudio
- websockets
- python-dotenv
- rich
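
A quick preflight check against the list above might look like the sketch below. It is not part of the project: `python_is_supported` and `missing_modules` are hypothetical helpers, and the import names are assumptions based on how these packages are normally imported (python-dotenv as `dotenv`, PyAudio as `pyaudio`).

```python
import importlib.util
import sys

MIN_PYTHON = (3, 11)
# Import names assumed for the listed packages; python-dotenv installs "dotenv".
REQUIRED_MODULES = ["pyaudio", "websockets", "dotenv", "rich"]


def python_is_supported(version=sys.version_info[:2]) -> bool:
    """True if the given (major, minor) version meets the minimum."""
    return tuple(version) >= MIN_PYTHON


def missing_modules(names=REQUIRED_MODULES) -> list:
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    if not python_is_supported():
        print(f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]} or higher is required.")
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
        print("Try: pip install -r requirements.txt")
    else:
        print("All listed modules were found.")
```

`importlib.util.find_spec` only locates a module without importing it, so the check does not open audio devices or trigger other import side effects.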
This project is licensed under the MIT License.