
Week 7 Progress Update

  • Writer: 정원 배
  • Jun 13
  • 2 min read

Voice Activated, Voice Responded: Our AI Assistant Now Talks and Listens!

This week, we brought two-way audio interaction to life. Our assistant can now listen to spoken commands and respond out loud — a major step toward making our system hands-free and elderly-friendly.


Speech-to-Text (STT): “Hold to Speak”

We’ve successfully integrated speech recognition into our Kivy UI. Users can now:

  • Hold the microphone button

  • Speak a command such as: “Weather in Tokyo” or “Play music by Adele”

  • Release the button, and the transcribed command appears in the input box.

This replaces typing for users who prefer to speak — a vital accessibility feature for our target audience.
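The press/record/release cycle above can be sketched roughly as follows. This is an illustrative model only: in the real app these callbacks are bound to the Kivy microphone button, and the `recognizer` interface (with `start`/`stop`/`transcribe`) is an assumed shape standing in for the actual speech-recognition backend.

```python
class HoldToSpeakButton:
    """Models the hold-to-speak cycle of the microphone button.

    `recognizer` is assumed to expose start()/stop()/transcribe() --
    placeholder names, not our real backend API. `input_box` is the
    text widget that receives the transcribed command.
    """

    def __init__(self, recognizer, input_box):
        self.recognizer = recognizer
        self.input_box = input_box
        self.recording = False

    def on_press(self):
        # User holds the button: start capturing audio.
        self.recognizer.start()
        self.recording = True

    def on_release(self):
        # User lets go: stop capturing, transcribe, and fill the input box.
        audio = self.recognizer.stop()
        self.recording = False
        self.input_box.text = self.recognizer.transcribe(audio)
```

In Kivy, `on_press` and `on_release` map directly onto a `Button`'s events of the same names, which is what makes the "hold to speak" gesture natural to wire up.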


Text-to-Speech (TTS): “Let It Read For You”

We also implemented text-to-speech for key content:

  • After fetching news, the assistant reads out the top headline and summary.

  • This supports users with limited vision or reading fatigue.

Now our assistant feels more human and responsive, with voice serving as both the input and the output.
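The headline read-out can be sketched as below. The article field names are assumptions for illustration, and `speak` stands in for whatever TTS call the engine provides (for example, pyttsx3's `engine.say`); injecting it keeps the formatting logic testable.

```python
def read_top_headline(articles, speak):
    """Read the top news headline and its summary aloud.

    `articles` is the fetched news list (the 'title'/'summary' field
    names are assumed here); `speak` is the TTS callable.
    """
    if not articles:
        # Still give audible feedback when the fetch returns nothing.
        speak("Sorry, no news articles were found.")
        return
    top = articles[0]
    speak(f"Top headline: {top['title']}. {top['summary']}")
```

Speaking only the top headline and summary, rather than the whole feed, keeps the audio response short enough to follow without reading along.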


Smooth Multi-Modal Interaction

With both STT and TTS in place, the user experience is now streamlined:

  • Speak your request

  • Watch it appear in the text box

  • Press “Request”

  • Get news, weather, or music results — and hear them read aloud

The full loop supports natural conversation-style interaction, making the system feel intuitive and familiar.
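The loop above boils down to a small dispatcher: take the transcribed command, route it by keyword to the right feature, then read the result aloud. A minimal sketch, where the handler mapping is a placeholder for our actual news, weather, and music modules:

```python
def handle_request(command, handlers, speak):
    """Route a transcribed command to the matching feature and speak the result.

    `handlers` maps a trigger keyword to a fetch function; the keyword
    names and routing-by-substring approach are illustrative, not the
    app's exact module layout.
    """
    text = command.lower()
    for keyword, fetch in handlers.items():
        if keyword in text:
            result = fetch(command)
            speak(result)  # close the loop: output is audible too
            return result
    # Unrecognized commands still get an audible response.
    speak("Sorry, I didn't understand that request.")
    return None
```

Because both the input (transcription) and the output (speech) are plain strings, this same dispatcher works whether the command was typed or spoken.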


Still in Progress: Ask AI & Hardware Audio

We’re preparing to:

  • Integrate the “Ask AI” button with IBM Watson Chatbot, enabling answers to one-off questions such as: “What’s the capital of Italy?” or “How do I make tea?”

  • Set up the microphone and speaker on the Raspberry Pi, so everything works smoothly without the laptop

These will be the focus of Week 8.


IBM Design Thinking in Action

Every new feature has been developed with user accessibility and technical simplicity in mind:

  • Large, readable buttons

  • Voice control to reduce reliance on touch or typing

  • Clear audio responses to reinforce understanding

We’re applying IBM’s user-centered design philosophy as we refine each layer.


🔜 Coming Up Next

  • Complete Watson Chatbot integration for “Ask AI”

  • Finalize microphone/speaker setup on Raspberry Pi

  • Prepare for project demos and testing

  • Add polish to the UI aesthetics

 
 
 
