Robots are becoming more intelligent, more autonomous, and more interactive. In this project, we combined the LeKiwi robot platform with reSpeaker Flex to turn a Kiwi-drive robot into a voice-interactive embodied AI system. It supports far-field wake word detection, natural-language voice commands, real-time movement control, simultaneous audio playback and speech recognition, and hands-free human-robot interaction. With voice AI built into the robot, users can simply speak to it and get real-time responses, which is far more intuitive than manual control.

Why add the reSpeaker Flex microphone array to LeKiwi?

LeKiwi is already a highly flexible robotics platform. With reSpeaker Flex integrated, the robot gains something fundamentally different: a true voice interaction interface.

This goes far beyond attaching a USB microphone.

The reSpeaker Flex 4-microphone array provides onboard acoustic intelligence, including:

Acoustic Echo Cancellation (AEC)
Beamforming
Noise Suppression (NS)
Direction of Arrival (DoA)
Far-field voice pickup

These capabilities allow the robot to continue understanding voice commands even while:

Playing music
Moving with motor noise
Operating in noisy environments

This is one of the key challenges in embodied AI: How can robots reliably “hear” humans in the physical world?

reSpeaker Flex addresses this by turning audio into a perception layer rather than simple sound capture.

What Can the LeKiwi Robot Do?

Once integrated, the LeKiwi Robot becomes a fully voice-controlled robotic system.

Users can interact naturally through commands such as:

“Hey Jarvis, move forward”
“Turn left”
“Strafe right”
“Stop”

The system supports:

Wake word activation
Speech-to-text transcription
LLM-based command understanding
Voice feedback through TTS
Real-time motor control

This creates a much more natural interaction loop between humans and robots.

Instead of pressing buttons or manually controlling movement, users simply speak to the robot.
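As a rough illustration of how a recognized command could end up as wheel motion, the sketch below maps the four example commands to a body velocity and converts it to wheel speeds using the standard inverse kinematics for a base with three omni wheels (a Kiwi drive). The command table, wheel angles, base radius, speeds, and function names are all assumptions made for this example; they are not taken from the LeKiwi firmware or from the project's code, where an LLM interprets the command.

```python
import math

# Assumed wheel layout: three omni wheels spaced 120 degrees apart, given as
# each wheel's position angle around the chassis (x = forward, y = left).
WHEEL_ANGLES_DEG = (90.0, 210.0, 330.0)
BASE_RADIUS_M = 0.125  # assumed distance from chassis centre to each wheel

# Hypothetical command table: (vx forward m/s, vy left m/s, omega rad/s, CCW positive).
# "stop" is listed first so it wins if several phrases appear in one utterance.
COMMANDS = {
    "stop": (0.0, 0.0, 0.0),
    "move forward": (0.2, 0.0, 0.0),
    "turn left": (0.0, 0.0, 1.0),
    "strafe right": (0.0, -0.2, 0.0),
}


def parse_command(text):
    """Return the body velocity for the first known phrase found in text, else None."""
    lowered = text.lower()
    for phrase, velocity in COMMANDS.items():
        if phrase in lowered:
            return velocity
    return None


def kiwi_wheel_speeds(vx, vy, omega):
    """Inverse kinematics: rim speed (m/s) of each wheel for a given body velocity."""
    speeds = []
    for angle_deg in WHEEL_ANGLES_DEG:
        a = math.radians(angle_deg)
        # Each wheel drives along the tangent of the chassis circle at its position.
        speeds.append(-math.sin(a) * vx + math.cos(a) * vy + BASE_RADIUS_M * omega)
    return speeds


if __name__ == "__main__":
    velocity = parse_command("Hey Jarvis, move forward")
    if velocity is not None:
        print(kiwi_wheel_speeds(*velocity))
```

Because the wheels sit 120 degrees apart, a pure translation makes the three wheel speeds sum to zero, while a pure rotation drives all three wheels at the same speed. That is what lets a Kiwi-drive base strafe sideways without turning first.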

System Architecture

The project combines multiple AI and robotics components into a single voice pipeline.

Hardware

The setup includes:

AI Pipeline

The interaction flow works as follows:

Wake Word → Speech Recognition → LLM Reasoning → TTS Response → Robot Movement

This architecture combines voice AI and robotics into a single embodied interaction system.
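The flow above can be sketched as a chain of stages. In this minimal sketch every stage is a stub passed in as a plain function: `detect_wake_word`, `transcribe`, `plan_action`, `speak`, and `drive` are hypothetical names standing in for whatever wake word engine, speech recognizer, LLM, TTS engine, and motor driver a real build uses. None of them come from the project's code.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Action:
    reply: str   # text to speak back to the user
    motion: str  # symbolic motion command, e.g. "forward" or "stop"


@dataclass
class VoicePipeline:
    detect_wake_word: Callable[[bytes], bool]
    transcribe: Callable[[bytes], str]
    plan_action: Callable[[str], Action]
    speak: Callable[[str], None]
    drive: Callable[[str], None]

    def handle(self, wake_audio: bytes, command_audio: bytes) -> Optional[Action]:
        """Run one interaction; return None if the wake word was not heard."""
        if not self.detect_wake_word(wake_audio):
            return None
        text = self.transcribe(command_audio)   # speech recognition
        action = self.plan_action(text)         # LLM reasoning
        self.speak(action.reply)                # TTS response
        self.drive(action.motion)               # robot movement
        return action


if __name__ == "__main__":
    # Stub stages that treat the audio bytes as already-decoded text.
    pipeline = VoicePipeline(
        detect_wake_word=lambda audio: b"hey jarvis" in audio.lower(),
        transcribe=lambda audio: audio.decode("utf-8"),
        plan_action=lambda text: Action(reply=f"OK: {text}", motion=text),
        speak=lambda reply: print("TTS:", reply),
        drive=lambda motion: print("MOTORS:", motion),
    )
    pipeline.handle(b"Hey Jarvis", b"move forward")
```

Keeping each stage behind a plain function makes it easy to swap a local model for a cloud service, or to test the control flow without any audio hardware attached.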

Why Far-Field Voice Pickup Matters

One of the most impressive aspects of the project is its far-field voice interaction capability.

In testing, the robot successfully responded to wake words and commands from distances of 5–7 meters.

This matters for real robotics deployments because users are rarely standing directly next to the robot.

Far-field pickup allows robots to:

Hear commands across rooms
Interact naturally while moving
Support hands-free operation
Maintain responsiveness in dynamic environments

Combined with beamforming and noise suppression, reSpeaker Flex enables significantly more reliable interaction than conventional microphones.

Real-World Robotics Challenges

Voice interaction becomes much harder once robots begin moving and speaking simultaneously.

Robots generate noise themselves: motors spin and speakers play audio feedback. Users speak from different distances and directions. In real-world environments, robots also have to cope with room reverberation, background conversations, and acoustically complex spaces with reflections or partial occlusions. They need far-field voice pickup, and an onboard speaker powerful enough to give clear conversational feedback wherever the user is standing.

Without proper acoustic processing:

The robot hears its own speaker output
Motor noise interferes with ASR
Wake word accuracy drops
Speech recognition becomes unstable

In these scenarios, stable and natural human-robot interaction requires much more than simple audio input: it requires spatial audio perception, robust far-field voice capture, and real-time acoustic processing.

That’s why acoustic algorithms like beamforming, Acoustic Echo Cancellation (AEC), noise suppression, and Direction of Arrival (DoA) become essential for embodied AI systems.

With onboard AEC, reSpeaker Flex can suppress the robot’s own playback audio while preserving user speech.

Combined with beamforming, noise suppression, and far-field voice pickup, this allows the robot to:

Play music
Speak through TTS
Continue listening at the same time, even over motor noise and in noisy environments

This full-duplex interaction model is essential for natural embodied AI experiences.
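To illustrate the idea behind echo cancellation, here is a minimal sketch of a normalized LMS (NLMS) adaptive filter, a textbook approach to AEC. It is not the algorithm running on reSpeaker Flex, whose internals are not described in this post. The function names, filter length, step size, and the toy echo path are assumptions chosen for the example. The filter learns how the playback signal (the reference) leaks into the microphone and subtracts that estimate, leaving whatever the reference cannot explain.

```python
import random


def nlms_echo_cancel(mic, ref, taps=8, mu=0.5, eps=1e-8):
    """Subtract an adaptive estimate of the echo of ref from mic, sample by sample."""
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent reference samples, newest first
    residual = []
    for m, r in zip(mic, ref):
        history = [r] + history[:-1]
        echo_estimate = sum(w * x for w, x in zip(weights, history))
        err = m - echo_estimate  # what is left after removing the estimated echo
        norm = sum(x * x for x in history) + eps
        step = mu * err / norm   # step size normalized by reference energy
        weights = [w + step * x for w, x in zip(weights, history)]
        residual.append(err)
    return residual


def simulate(n=4000, seed=0):
    """Build a toy playback signal and the echo of it that reaches the microphone."""
    rng = random.Random(seed)
    ref = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    echo_path = [0.0, 0.6, 0.3, -0.1]  # assumed: delayed, attenuated copies of playback
    mic = [
        sum(h * ref[i - k] for k, h in enumerate(echo_path) if i - k >= 0)
        for i in range(n)
    ]
    return mic, ref


def power(signal):
    """Mean squared amplitude of a signal."""
    return sum(v * v for v in signal) / len(signal)


if __name__ == "__main__":
    mic, ref = simulate()
    cleaned = nlms_echo_cancel(mic, ref)
    print("echo power before:", power(mic[-1000:]))
    print("echo power after: ", power(cleaned[-1000:]))
```

In this toy setup the microphone hears only the echo, so the residual shrinks toward zero once the filter has adapted. If the user were speaking at the same time, their voice would remain in the residual because it is uncorrelated with the playback reference. Production echo cancellers add much more on top of this, such as handling the moments when the user and the speaker are active at once.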

More Than Just a Robot Demo

Projects like this demonstrate a broader trend in robotics:

Robots are evolving from passive machines into interactive AI systems.

Voice becomes one of the most natural interfaces for embodied AI because it allows:

Hands-free interaction
Low-latency control
Natural communication
Multimodal perception

And microphone arrays are becoming increasingly important as the “hearing system” behind that interaction.

Rather than simply recording audio, systems like reSpeaker Flex provide:

Spatial hearing
Acoustic perception
Noise-robust interaction
Real-time audio intelligence

This is what enables robots to operate more naturally in human environments.

Start Building Your Own Voice Robot

This project provides a powerful starting point for developers interested in:

Embodied AI
Voice-enabled robotics
Local AI agents
Human-robot interaction
Edge AI audio systems

By combining LeKiwi with reSpeaker Flex, developers can rapidly prototype robots capable of natural speech interaction and real-world responsiveness.

Whether you’re building:

AI companion robots
Educational robotics
Interactive assistants
Smart service robots
Experimental embodied AI systems

voice interaction is quickly becoming a core part of the experience.

And it starts with giving robots the ability to hear intelligently.

Get the reSpeaker Flex here: reSpeaker Flex 4-Mic Array: Split Type For Embodied AI | Seeed Studio

Get the LeKiwi Kit here: LeKiwi Full Kit (12V Version)

Check the LeKiwi + reSpeaker Flex integration wiki here: Add Voice Interaction to Your LeKiwi Robot with reSpeaker Flex | Seeed Studio Wiki

About Author

Elena Tang


Tags: embodied AI, ESP32S3, LeKiwi, Mic Array, microphone array, respeaker, reSpeaker Flex, robot, robotic, Sound AI, Voice AI, voice command, Voice Control, Voice interaction, XIAO, XIAO ESP32S3, XMOS, XMOS XVF3800, XVF3800

2 thoughts on “Add Voice Interaction to LeKiwi Robot with reSpeaker Flex”

Anonymous says:

May 31, 2026 at 2:34 am

Is your reSpeaker Flex upside down? Compared to the video and mic holes, it appears to be upside down.

Elena Tang says:

June 1, 2026 at 1:59 am

Yes, you’re absolutely right, and thank you for pointing that out. You have a very sharp eye!
The microphone holes on reSpeaker Flex are intentionally placed on the back side of the PCB. This design allows the microphones to be positioned as close as possible to the sound source in embedded applications, while also minimizing the impact of other electronic components on the front side of the PCB.
In this particular LeKiwi setup, however, the reSpeaker Flex is mounted externally and elevated above the robot chassis, so the orientation has very little practical impact on voice pickup performance.
That said, your observation is completely correct. For desktop robots and similar applications, it is generally better to mount the reSpeaker Flex with the microphone holes facing upward or forward to achieve the best possible acoustic performance.
Thank you again for catching that and bringing it to our attention!
