Always Find The Rabbit.
Ajay Juneja stashed this in Philosophy
The Eureka Moment
Looking back, voice user interfaces have always been one of the areas of computer science with the potential to move the human race forward… so what happens when… it’s 4am on your 21st birthday and you finally have a system that works the way we’ve all wanted one to work for your entire known life, since the days of Star Trek, Knight Rider, and 2001: A Space Odyssey?
December 5th, 2001, was that moment for us.
Randy Pausch’s lab was the one next door to the lab we were in at CMU, that of the Carnegie Mellon Interact Lab.
Not surprisingly, his lab was the only other one with people in it at that time in the morning.
I showed the system working with a demo that lasted for an hour, and I have a giant 4-foot-tall Scooby-Doo commemorating that moment. It used up more than a gigabyte of RAM back then, needed a $9000 dual-processor computer to run, and crashed every hour, but it worked… the way a voice user interface should.
The age of context begins here.
What makes this dialogue system, as we call it, unique is that it used Context, State, and Sentence Structure to reduce the search space of the speech recognizer. We looked at speech as a context-sensitive search problem, whereas everyone else was looking at it as an audio and acoustics problem.
I walked back home to 5302 Beeler Street with a huge grin and Scooby in hand.
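A minimal sketch of the "context shrinks the search space" idea, not the actual system: the dialogue state selects a small set of phrases that make sense right now, and only recognizer hypotheses inside that set are considered. The state names, the phrase lists, and the `pick_hypothesis` helper are all hypothetical and exist only for illustration.

```python
from __future__ import annotations

# Hypothetical mapping from dialogue state to the phrases valid in that state.
STATE_GRAMMARS: dict[str, set[str]] = {
    "idle": {"navigate to", "call", "play"},
    "awaiting_destination": {"san francisco", "home", "work"},
    "confirming": {"yes", "no", "cancel"},
}


def pick_hypothesis(state: str, hypotheses: list[tuple[str, float]]) -> str | None:
    """Return the best-scoring hypothesis that is valid in the current state.

    `hypotheses` is a list of (text, acoustic_score) pairs, as a recognizer
    might produce. Anything outside the state's phrase set is discarded, so
    context prunes the search before scores are even compared.
    """
    allowed = STATE_GRAMMARS.get(state, set())
    candidates = [(text, score) for text, score in hypotheses if text in allowed]
    if not candidates:
        return None  # nothing the context allows; the caller can re-prompt
    return max(candidates, key=lambda pair: pair[1])[0]


# "know" sounds like "no" and scores higher acoustically,
# but it is not a valid answer while confirming.
best = pick_hypothesis("confirming", [("know", 0.9), ("no", 0.8), ("home", 0.7)])
```

With pure acoustics, the recognizer would pick "know"; with the state applied, the only surviving candidate is "no".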
Seeing the Future.
On the floor below us, there was a car that could drive itself, called the NavLab. It was funded by Bosch, who was also one of the main funders of our research.
The NavLab was the predecessor to Red Team Racing (CMU) and Stanley (Stanford). Our other funder was DARPA. In fact, the same DARPA grants were funding a team at SRI. Today you might know them as Siri. The NavLab project you might know now as the Google and Audi self-driving cars.
Grad School or Start a Company?
Bosch graciously offered to fund a PhD (Thank you Jeffrey Donne and Jill Wooster for your help in making that happen), but I chose to start a company instead, Speak With Me.
Everyone will have a 1 GHz smartphone by 2007, right?
My thinking (in 2003) was that everyone would have a gigahertz smartphone by 2007. Turns out, that was a few years too early. But I reasoned that the PhD was going to take too long (7-13 years) if we wanted to make our market window. It was better to get the core of the dialogue system into the hands of consumers first; the stuff that was going to be in my PhD (a Situationally Adaptive Dialogue Manager for Driving Applications, aka let’s make KITT from Knight Rider for real) could be done later on, after the core technology had some distribution.
Speak With Me, Inc.
So we negotiated a license with CMU and Speak With Me started in July 2005.
Fundamentally, we at Speak With Me made a different bet than the rest of the industry. I personally was hell-bent on running things client-side because, for a voice interface, speed is one of the four main things a system needs to feel humanlike. Today we have a system that runs mostly client-side and responds to your requests in a tenth of a second. It’s 50 times faster than Siri. And it doesn’t need a network connection to work.
Our first project, during 2007-2009, was the most fabulous TomTom GPS ever, back when TomTom was selling 12M units a year. It did way more than what Siri does in iOS 6. The CTO of TomTom said we gave them a Rolls Royce, and all he wanted was a Fiat. I responded that consumers today want the Rolls Royce for the price of the Fiat. We were performing driver assistance tasks, like, “Take me to San Francisco using 380 to 280 to 101.” Today’s GPS systems STILL don’t do that.
Hitting The Wall
Two weeks after we finished that product, Google gave away navigation for free on Android. TomTom sales dropped 80% overnight (and their stock price dropped 25% in one day), and that product never shipped. Today, TomTom is the map supplier for Apple’s take on a product we had in mind to ship 3 years before them.
We had spent over a million dollars of our funding making that product, and I thought it would take Google at least 3 more years to map the whole planet. You know, ’cause mapping the whole planet is kinda a big task. Well, they actually hadn’t mapped the whole planet. Apparently they had only mapped the USA at that point, but that was enough to kill the $7 billion GPS market overnight.
Unfortunately, smartphones were still not fast enough to run a client-side dialogue manager. The first one able to do so was the iPhone 4, which didn’t come out till June 2010.
Where’s the next rabbit?
In April 2010, we spent 6 hours with the Siri team on the day they were getting bought by Apple. Adam Cheyer, Tom Gruber, and I all felt the speed of the client-side system was a big deal, and that the ideal system was a hybrid client/server approach.
But none of us knew how to combine the two.
There he is. Nope, not there. Look over there.
This is where your passions are really helpful things. The answer to this question came from the world of music. When we paired signal-to-noise ratio with the speech recognition confidence scores from the on-device speech recognizer, we had a low-compute way of determining which utterances to run device-side vs. server-side.
This is one of our patents pending, and it came about through thinking about things from the world of music, rather than the world of computer science.
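As a rough illustration of that routing decision, and only that: the sketch below keeps an utterance on the device when the audio is clean and the on-device recognizer is confident, and otherwise hands it to the server. The function name, the threshold values, and the simple "both must pass" rule are assumptions made up for this example; how the two signals are actually combined is not described here.

```python
def route_utterance(
    snr_db: float,
    confidence: float,
    min_snr_db: float = 15.0,      # hypothetical threshold, for illustration only
    min_confidence: float = 0.8,   # hypothetical threshold, for illustration only
) -> str:
    """Decide where to process an utterance: "device" or "server".

    snr_db:     estimated signal-to-noise ratio of the captured audio, in dB
    confidence: the on-device recognizer's confidence score, from 0.0 to 1.0

    Both inputs are already available on the device, so the decision itself
    costs two comparisons and needs no network round trip.
    """
    if snr_db >= min_snr_db and confidence >= min_confidence:
        return "device"
    return "server"


clean_and_sure = route_utterance(snr_db=25.0, confidence=0.92)
noisy_audio = route_utterance(snr_db=8.0, confidence=0.92)
unsure_recognizer = route_utterance(snr_db=25.0, confidence=0.40)
```

The appeal of a check like this is that the fast path stays fast: only the utterances the device is likely to get wrong pay the cost of a trip to the server.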
We’re working on finishing this next-generation dialogue manager now. I call her Aria, a play on our research name of Ariadne. It works both offline and online. And it helps you integrate context with your big data, your sensor data, and your user interface, to power what Robert Scoble calls The Age of Context, and Wearable Computing.
So here is what we learned along the way:
Always trust your gut instincts.
Your competitors are your best friends.
They know your space as well as you do, they struggle with the same things as you. Remember that there is plenty of pie for all of you to eat, and make sure to invite them to your parties, and go to theirs.
You each will learn to make better pie if you try one another’s recipes :)
Your true friends are also your best friends.
When your back is against the wall, move your arm around the hat until you find the next rabbit.
The rabbit isn’t always where you expect it to be.
Trust that there will always be a rabbit, if you are a good person and treat people kindly.
It takes a village to start a company.
When you screw up, cash in your built up goodwill and faith and try again.
Remember your core motivations for why you got started down your path.
Be gracious and humble at all times.
Trust that when you give everyone in your life a hug, that the world will give you one back, as long as you keep giving hugs.
When in doubt, keep searching for the next rabbit. There is always another one.
Thank you to all of our investors, customers, advisors, staff, consultants, and all of our wonderful friends and competitors. We can’t wait for you to try our next pie recipe. It’s our best yet.