Senior Voice Engineer
About TechSee:
TechSee is the global leader in AI-powered visual assistance, helping the world’s largest service providers transform how customers and technicians solve complex device and connectivity issues. Our Visual AI platform - trusted by Vodafone, Orange, Hitachi and dozens of Fortune 500 enterprises - combines computer vision, augmented reality, and conversational AI to resolve millions of support interactions every year. TechSee is backed by Salesforce Ventures, Telus, Scale Venture Partners, and OurCrowd.
The opportunity:
Voice is how most people will first reach our new AI assistant. They will pick up the phone, open a mobile app, or tap a voice button, and within seconds expect a real conversation that understands them, their home, and their problem.
As Senior Voice Engineer, you will own that experience end-to-end - from frontend UI decisions, through the realtime backend, to the agent that actually solves the problem. You will build a voice stack that feels human, runs at consumer scale, and bridges seamlessly into visual and chat channels when the conversation needs to evolve.
Key Responsibilities:
Own the Voice Channel End-to-End
Architect and build the realtime voice pipeline: STT, TTS, turn-taking, VAD, barge-in, latency tuning.
Integrate with telephony, mobile SDKs, and messaging channels (WhatsApp, iMessage, in-app voice).
Design seamless multi-modal handoffs - from voice into camera-based visual sessions and back.
Full-Stack Delivery
Ship the mobile and web frontends that capture and render the voice experience.
Build the backend services that drive realtime audio, session state, and integration with our agent platform.
Make hard calls on protocols, codecs, streaming strategies, and provider trade-offs.
Make It Feel Human
Obsess over latency, prosody, interruption handling, and recovery from misrecognition.
Build instrumentation that surfaces voice quality issues before users complain.
Partner with AI engineers to make sure the agent behind the voice is actually conversational, not transactional.
Set the Bar
Define the team’s standards for realtime systems, audio quality, and observability.
Mentor engineers across frontend and backend on voice-first thinking.
Qualifications:
Senior-level experience building voice-based products in the conversational agent space.
Strong background across both frontend (mobile and/or web) and backend.
Hands-on experience with realtime audio systems - STT/TTS providers, streaming, telephony, or equivalent stacks.
Solid grasp of conversational design pitfalls and how to engineer around them.
B.Sc. or higher in Computer Science or a related field.
Advantage:
Track record on large-scale production systems.
Experience with AWS or other major cloud platforms.
Background with LLM-driven agents, MCP/A2A, or multi-agent orchestration.
Why Work With Us?
At TechSee, we combine cutting-edge innovation with a people-first philosophy. We are looking for high-performers who are driven by excellence, collaboration, and the desire to make a tangible impact on the future of AI.
A voice product that matters. Millions of consumers, real problems, no scripts - your work will be heard, literally.
Greenfield stack. Pick the right tools, set the right patterns, build it the way it should be built.
Cross-disciplinary team. Sit between AI, mobile, infra, and domain experts who actually know the field.
Autonomy and bold execution. We value individuals who are proactive and results-oriented.
Hybrid by design. Herzliya office, flexible remote days, ownership over your craft.
Department: R&D
Locations: Herzliya, Israel
Remote status: Hybrid