I am Joshua Montgomery, Founder and CEO of Mycroft AI, here to answer your questions.

Mycroft is a privacy-focused, open source, AI voice assistant.

We have been recognized as the private voice assistant by groups such as Ubuntu, Mozilla, KDE, and Jaguar Land Rover.

Ask me anything!

For more info visit: https://mycroft.ai/

Proof: https://twitter.com/mycroft_ai/status/1060171460994650116

Edit: Unfortunately, it's that time. Thanks for all of the great questions. I really enjoyed this. I will definitely be doing this again!

Edit#2: Mycroft Chief Technology Officer, Steve Penrod (SteveP_MycroftAI), will continue to answer technical questions! Thanks for everything Reddit!

Comments: 649 • Responses: 39

swgmuffin681 karma

Why is the name Mycroft? The only thing that comes to mind is the Sherlock Holmes character.

oojoshua316 karma

Excellent question. We wrote a whole blog post about it: https://mycroft.ai/blog/why-name-it-mycroft/

anotherrustypic322 karma

If I am not wrong, AI works by learning from data. How does this go hand in hand with your privacy focus?

oojoshua612 karma

By default we keep people's data private. To improve our machine learning models we only use data from people who explicitly opt in, which is about 15% of our customers.

We've also developed a way for people to be forgotten if they choose to opt out later. Data from people who opt out is removed from the data set, and users of the data are required to refresh it every 30 days, so within a month of opting out the data is gone.
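A minimal sketch of how the 30-day refresh described above could behave, for readers who want to see the idea in code. The sample layout and the names `refresh_dataset` and `copy_is_stale` are hypothetical, invented for illustration; this is not Mycroft's actual implementation.

```python
from datetime import date, timedelta

# Refresh period taken from the answer above; everything else is assumed.
REFRESH_WINDOW = timedelta(days=30)


def refresh_dataset(samples, opted_in_users):
    """Return only the samples whose contributor is still opted in."""
    return [s for s in samples if s["user"] in opted_in_users]


def copy_is_stale(last_refresh, today):
    """True when a downstream copy is older than the refresh window."""
    return today - last_refresh > REFRESH_WINDOW


# Hypothetical data: three clips from two contributors.
samples = [
    {"user": "alice", "clip": "a1.wav"},
    {"user": "bob", "clip": "b1.wav"},
    {"user": "alice", "clip": "a2.wav"},
]
opted_in = {"alice", "bob"}

opted_in.discard("bob")  # bob opts out later
fresh = refresh_dataset(samples, opted_in)
print([s["clip"] for s in fresh])  # bob's clip is no longer present
```

Anyone holding a copy older than the window would have to fetch the filtered set again, which is how an opt-out propagates within a month.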

15% is actually quite a lot of data and has helped us improve the models significantly. The reality is that if open source is going to remain relevant in the future we need to develop and implement models like the Mycroft model to allow the community to build competitive technology. We think we've done a good job balancing privacy with machine learning, but remain open to suggestions about how we can improve it.

anotherrustypic118 karma

I was about to follow up with whether or not 15% data would be enough to build a stable machine, but you believe it is, so my question stands answered.

So are you planning on going public, head to head with the Siris, Cortanas, and Alexas?

oojoshua196 karma

We are public! ( https://startengine.com/mycroft-ai ) or at least non-accredited investors can participate.

We are in the same category as the other solutions, but with a different focus. They are building intelligent assistants; we are building an AI agent. It seems pedantic, I know, but having a larger goal does make us a different beast. I think Google Assistant is the technology that has a similar goal to our own.

We're differentiated by our focus on privacy and our willingness to move the Mycroft back-end into our customers' security perimeter or cloud. This allows them to customize it, change it and make it their own. It remains to be seen if this is a good strategy. I think we'll know a lot more by the end of 2019.

CosmicSpittle206 karma

At one point in Mycroft development you wrote a blog about eventually getting to a point where an internet connection would not be necessary. For those of us that would like to use Mycroft in our own private Idaho, when do you see this being realistic?

oojoshua271 karma

Yes, though it may require you to run a server instance on another computer in your home. We may also ship a speech to text model that you train yourself or that is trained on a much smaller subset of your language.

This is coming, but probably not in 2019. Look for it in 2020 at the earliest.

KingOfTerrible114 karma

Usability and feature wise, how does Mycroft honestly compare to Amazon Echo or Google Home?

oojoshua221 karma

We have a nice blog post on this where we benchmarked our stack against both of them ( https://mycroft.ai/blog/the-mycroft-benchmark/ ). The answer was: we were tested and found wanting. This was great! This was our first benchmark and gives us a baseline to compare against. The team is now working to improve our performance and bring it up to par.

Our first production version will be released in February 2019. At that time I expect you'll see similar performance across the top 10 skills. We're certainly working hard to make that happen.

RaiderHawk7575 karma

How much can you bench press?

oojoshua135 karma

I usually start at 135 because my ego won't let me put anything lower than a 45 on the bar. In reality I am getting old enough that I should be pushing 125 or 100 with a lot more reps. I wish I had more time to work out!

BarelyLegalAlien69 karma

How often do people misread Mycroft for Microsoft?

oojoshua115 karma

Are you from the Mexican Trademark Office? They had the same question. Not often.

MylesMDT69 karma

How soon will Mycroft start throwing rocks?

oojoshua103 karma

16 October 2076 if we go by the book. Will need to get a Mark IV on the moon. Maybe Elon can help?

seanmercher55 karma

Any plans for a future model with video chatting capabilities like the Lenovo Smart Display (via Google Duo), Facebook Portal (via FB Messenger), Amazon Alexa Show, and others have?

oojoshua64 karma

Yes! That is on the roadmap some time in the future. The Mark II has the required equipment and the Mark III will have it as well. I would love to see a community contribution for that skill because we simply don't have the manpower right now to do it in house.

We've been giving a lot of thought to how we reward significant community contributions like this and are likely going to start sharing our recurring revenue based on skill usage, but we haven't worked out a detailed plan for it yet. Keep an eye out for a blog post on this subject.

osominer37 karma

What languages do you support?

oojoshua78 karma

We currently support English, but are making great strides toward other languages. The Speech-To-Text team is working closely with Mozilla on the DeepSpeech engine which is gradually moving toward production. The top languages for DeepSpeech are Catalan, German, French and English. People interested in helping can visit the Common Voice page and contribute in their language at: https://voice.mozilla.org

On the Text-To-Speech side we've seen excellent progress in German and I've been on the road building interest in other languages including Icelandic, French, Spanish, Catalan and Portuguese.

The team is also making good progress translating all of the natural language understanding syntax at https://translate.mycroft.ai. Several languages are nearly complete including German and Russian.

Eventually the goal is to support any language with a population that is interested in having a voice assistant. We're working to tie all of the tools together to make it easy for people to contribute. When that is done we'll work to empower our community to participate by providing encouragement and running some simple contests.

Akay-Rawat33 karma

What's the price?

oojoshua85 karma

Free and open source ( Picroft https://mycroft.ai/documentation/picroft/ )

We also sell smart speakers. The Mark II is currently for sale on Indiegogo at $189 https://www.indiegogo.com/projects/mycroft-mark-ii-the-open-voice-assistant#/

got-it-man31 karma

Hi Joshua. Thank you very much for doing this AMA. I recently found Mycroft AI and played a bit with it. I have some questions:

  1. Why does the mycroft-core software have problems with finding the timezone for Germany?
  2. What is planned for the future e.g. in terms of skills (generally software updates)?
  3. Will the default voice be improved to be less robotic? I sometimes have a bit trouble understanding.
  4. How is the company Mycroft AI financially positioned? Even if the core product is open-source I am a bit scared that it might be hard to be able to compete against huge companies like Amazon and Google and still be able to pay the bills.

Thank you very much for answering my questions.

oojoshua66 karma

Why does the mycroft-core software have problems with finding the timezone for Germany?

I'm tempted to insert the old joke about the German, the sheep and the airplane here, but the answer I have is: I don't know. If you have a repeatable bug I'd encourage you to submit it to [[email protected]](mailto:[email protected]) and we'll look into it.

What is planned for the future e.g. in terms of skills (generally software updates)?

We push updates every two weeks and will continue to do so. We only make breaking changes when we do a major release. These take place in February and August. Overall you can expect the experience to improve remarkably between now and February 2019 when we release to production. From that point forward you can expect things to improve every two weeks.

Will the default voice be improved to be less robotic? I sometimes have a bit trouble understanding.

Yes. The Kusal voice was our first pass at a machine learning generated voice. I expect the team to have better voices available for the Feb 2019 release.

How is the company Mycroft AI financially positioned? Even if the core product is open-source I am a bit scared that it might be hard to be able to compete against huge companies like Amazon and Google and still be able to pay the bills.

Amazon and Google's interest in this market just proves how massive it will be. There is strong demand for our product and when we close our Series A in December we'll have 18-24 months of runway ( depending on sales ). Personally I'm not worried. This is a huge market, we've got a great product in development and Amazon and Google can only compete in so many places. There are a lot of markets that they aren't interested in or can't access for regulatory reasons. As far as I'm concerned those are our markets to lose. To win them, however, we need to get more languages into production and fix our user experience. To do that we need more people, so......we're hiring!

malevolo27 karma

Another person concerned about the language theme.

Open source software has always had great support for all languages... even better than closed source solutions. An open source voice assistant must excel at it as well. Translating something like LibreOffice is fairly trivial, so how do you plan for people to help translate "natural voices" and "natural expressions"?

I mean, skill developers hardcode their sentences into the vocab and dialog files, but they don't follow any good practice (because there isn't one) like parametrizing the verbs, the nouns, the adjectives, and so on, so that community members can translate into their languages with a bit more context. For example, they could write something like "verb['ruin'] or noun['ruin']" and translators would find it quite easy to translate each word.
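The tagging idea in this comment could look something like the sketch below. It is only an illustration of the suggestion: the table layout and the `translate` helper are hypothetical and are not part of mycroft-core or its vocab and dialog file format.

```python
# Hypothetical vocab table keyed by (part of speech, source word), so a
# translator sees which sense of "ruin" a skill developer meant.
TRANSLATIONS_ES = {
    ("verb", "ruin"): "arruinar",
    ("noun", "ruin"): "ruina",
}


def translate(pos, word, table):
    """Look up a word by its part of speech; fail loudly when missing."""
    try:
        return table[(pos, word)]
    except KeyError:
        raise KeyError(f"no translation for {pos} '{word}'") from None


print(translate("verb", "ruin", TRANSLATIONS_ES))
print(translate("noun", "ruin", TRANSLATIONS_ES))
```

Keying on the part of speech gives the translator the context that a bare string lacks, which is the point the comment is making.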

Gender and number are a problem in a lot of languages, and if the assistant must speak "naturally" it needs to know somehow whether it is speaking with a man or a woman to answer correctly. Or, if I ask a question regarding a woman, Mycroft should be aware of her gender and end the words accordingly (you must end the word with a vowel, "a" for female and "o" for male, leaving aside all the exceptions, of course).

Is bilingualism even on your radar? It would be *awesome* if Mycroft could understand at least a pair of languages, let people speak either of them, and answer in the corresponding one. In countries with two or more official languages, people usually tend to mix both languages in the same phrase, so if Mycroft could extract the context from a mixed phrase, I think it would be a killer feature no other assistant has (and probably never will).

I know there are new tools for people to contribute (Persona, Recording Studio), not yet launched. Any ETA? Will they be multi-language from the start? I know Mycroft's multi-language support is not yet ready, but contributions in all languages can be stored, because the data can be contributed even before the support exists.

oojoshua49 karma

All of these are excellent comments and critiques. The real answer is - this is very hard. To succeed we need new ideas, new implementations and fresh points of view. Today we've got a lot of tools scattered about that make it difficult for people to contribute. They have to go to https://translate.mycroft.ai to translate prompts, https://voice.mozilla.org to contribute and tag samples, https://home.mycroft.ai to tag Mycroft DeepSpeech submissions and to tag precise wakeword samples. Finally we have two new tools - Persona and Recording Studio - that aren't even launched. Even I have trouble keeping up with everything.

I have been advocating for Chris Veilleux to spend some time unifying these tools into one experience, but right now he's wrapping up the skills store and I don't expect to see any significant progress before the end of the year.

The one thing we need to do to improve the speed of Mycroft is......hire........we need to fill 11 positions right now and may have another 15 to 20 in Q1 2019 ( depending on a customer proposal we sent last week ). With some additional developers we'll be able to move faster so if you know anyone who might be a fit......send them our way!

yyjd24 karma

Hi Joshua, thank you for doing this ama. I wanted to ask if you guys have any plans for federation between individual devices in a home, or possibly a larger scale form of distributed computing?

oojoshua30 karma

Yes. At a minimum we want to make the intercom feature work ( as shown in our 2015 Kickstarter video ) and sync music playback across devices. We may also be able to move the speech-to-text software to a high end device like the Mark II and use it to provide on site transcription to an upcoming Mark I revision.

We may also be able to support larger scale clustering for specific purposes, but don't have any plans yet. It would be awesome to see a community [email protected] type skill that puts the FPGA chips in the Mark II to use! Global parallel computing is pretty cool stuff!

danijami2315 karma

How long before you turn on your people for dollars like every other privacy focused company? Joking aside, what's the accountability to stop you from doing just that when you have thousands of customers' data?

oojoshua42 karma

Among other things we're converting into a B-Corporation at the end of the year. https://bcorporation.net/ As part of this process we are putting in place a series of corporate governance rules that prevent misbehavior.

That and we have integrity. When I look at other CEOs and technology leaders the ones I feel the most kinship with are guys like Jimmy Wales, not Mark Z.

chrisff198914 karma

I want something that can freely open, close and control programs or can be taught how. Obviously without having to touch the keyboard or mouse. Can this do that?

oojoshua24 karma

Here is a community generated demo on Ubuntu: https://www.youtube.com/watch?v=5gCNG4RaLj8

Itilvte9 karma

Can a single mycroft installation receive audio from multiple sources, e.g. from 1) local microphone and 2) audio over ethernet, and be configured to respond through the appropriate channel?

oojoshua29 karma

Sure, but you'll need to hack around with the message bus to make it work. Docs are here: https://mycroft.ai/documentation/

Sounds like a good weekend project.
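As a starting point for that weekend project, here is a rough, self-contained sketch of the routing idea: tag each utterance with the channel it arrived on and send the reply back to the same channel. It does not use the real Mycroft message bus client. The message type strings, the `context` field, and the channel names `local_mic` and `ethernet_audio` are assumptions made for illustration, so check the documentation for the actual message format.

```python
import json

# Hypothetical channels; each list collects the replies sent to it.
replies = {"local_mic": [], "ethernet_audio": []}


def make_utterance(text, source):
    """Build a bus-style JSON message, tagging the audio source."""
    return json.dumps({
        "type": "recognizer_loop:utterance",  # assumed message type
        "data": {"utterances": [text]},
        "context": {"source": source},
    })


def make_reply(request_json, text):
    """Build a reply that carries the request's context forward."""
    request = json.loads(request_json)
    return json.dumps({
        "type": "speak",  # assumed message type
        "data": {"utterance": text},
        "context": request["context"],
    })


def route_reply(reply_json):
    """Deliver a reply to the channel named in its context."""
    reply = json.loads(reply_json)
    source = reply["context"].get("source", "local_mic")
    replies[source].append(reply["data"]["utterance"])
    return source


request = make_utterance("what time is it", "ethernet_audio")
route_reply(make_reply(request, "It is noon"))
print(replies)
```

The design choice is simply that the reply inherits the request's context, so nothing downstream needs to know how many audio sources exist.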

TomTom_Attack9 karma

I forgot I even backed this... are you guys on track to deliver these anytime soon?

oojoshua23 karma

Barring any unforeseen delays the first batch will ship in December.

takjek9 karma

What is the current progress on the Mark II and by when do you expect the delivery? There were blog posts about the progress in the past but I haven't seen new ones recently.

oojoshua19 karma

The Mark II is coming along. We selected our CM last month and I wired the first 1/2 of the manufacturing fees two weeks ago. They have produced 10x boards as a test run and are working to be "pencils down" on plastics by November 12. I expect we'll receive and ship the first batch of bulk manufactured Mark II units in December with shipping to continue into Q1 2019. I'm still unsure as to how large the batches will be, but should have clarity in the next couple of weeks. The team has been asked to put up an update on Kickstarter and Indiegogo in the next week.

monadoboyX9 karma

What do you think the future applications for voice AI are?

oojoshua32 karma

Call centers, personal call screener, infinitely patient teacher, robotics control, vehicle control, front line customer service, cockpit assistant ( aircraft ), cockpit assistant ( race cars ), scuba diving assistant ( seriously ), hotel concierge, cruise ship concierge, real time simultaneous translation, hospital assistant......and on and on.

bradgy8 karma

Can you explain the rationale for having the male voice as the default and placing the female voice behind a paywall?

I know there is an ongoing debate over ai assistants and how most of them seem to have female personas by default (Alexa, Siri etc), perpetuating a women in servitude stereotype. For some reason (and I am not entirely sure I can even articulate my thinking on it right now) it seems unfair to me that one gender's voice is unavailable unless you pay extra. It would be great if you could give your thoughts on this, Cheers.

oojoshua18 karma

We're not too far away from having dozens of voices in different genders, languages and accents. We put the voice behind the paywall because we need to add SOME value for the $2 people pay us. In the future this may be a partnership with a music streaming service, but for now we put the Amy voice behind it ( which is awful by the way - we're working on it ).

Mycroft uses a male voice because I have two daughters and a lovely wife and don't like the idea of training a generation of young men to demand instant service from a technology perceived as female. Also, Mycroft in the book presents as male (though we learn early on that the AI is also comfortable as a female "Michele" with a French accent).

Interestingly, to be faithful to the narrative, we should actually use an Aussie accent, but we aren't quite there yet.

As we add more voices and create partnerships with music services and video services that add more value to the stack I expect we'll move a pair of voices ( male and female ) outside the paywall for each language.

10waf7 karma

What technical differences can we expect between the Mark II and what you call the production release in Feb '19?

oojoshua18 karma

The production software will power the Mark II. The device is being shipped with beta software, but will update automatically when new software is available. When we ship 19.02 all of the units that are online will automagically upgrade themselves.

I love the 21st century. You buy stuff and instead of depreciating and becoming useless, it gets better over time.

Reminds me of one of my favorite sci-fi ideas ( not my favorite novel, but a novel with a pretty cool concept behind it ) David Brin's 1984 "The Practice Effect" which posits a universe where entropy is backwards. https://en.wikipedia.org/wiki/The_Practice_Effect

Crystalball20207 karma

How large can Mycroft become in light of the market domination of amazon? In light of the red hat acquisition, how valuable do you think Mycroft can become in 5 years?

oojoshua18 karma

Mycroft has the potential to be a significant player in this space. All of the other technologies have ancillary businesses to promote or protect. This makes it hard for them to focus on building the best voice technologies. Instead they are looking to voice to promote their other businesses - hardware sales ( Apple ), online retail ( Amazon ), search ( Google ).

By focusing on building a great voice assistant and working toward a common goal - a voice assistant that runs anywhere and interacts exactly like a person - our community can succeed in a significant way. There are tons of big corporations out there that want to deploy voice technology, but don't want to do business with Apple, Amazon or Google.

In terms of how valuable we can be......if we achieve our goal and the Mycroft agent is able to interact naturally across a broad range of applications ( call centers, booking appointments, front line customer service, etc. ) we will be on the path to financial success. What does that look like in numbers? I don't know, but we took it from $0 to $36.8M in 3 years and I think we can increase the rate of change. We'll see what happens.

takjek7 karma

What is the current progress on conversations? I assume this is a very difficult concept so how are you planning to approach it?

oojoshua17 karma

We built a tool called Persona to handle this. Missed intents and conversational gambits are fed into it and our community will soon be allowed to resolve these queries. The resulting text is fed into a Machine Learning algorithm that is then responsible for responding to missed intents and engaging in conversation. I wrote a pretty extensive blog post on how this is intended to work: https://mycroft.ai/blog/building-strong-ai-strategy/

That blog post is basically a step by step procedure for building an AI that can beat Turing.

bulgarianBarbarian7 karma

Thank you for doing an AMA! My understanding is that the iPhone became a Trojan horse for Apple in the workplace because users loved it so much, they demanded jobs make it work for work emails and this led to more Macbooks, iPads, etc. Do you see having to overcome similar challenges against an established AI like Alexa because that is what customers/companies know of? As an example, if Jaguar puts Mycroft in their cars, how do you avoid customers rejecting it?

oojoshua30 karma

The problem with the established players in this market is privacy and user agency. Alexa sends all audio to Amazon, makes none available to skill developers and represents Amazon's best interests, not yours. That means companies are reluctant to make use of it and those who do may come to regret it: https://continuations.com/post/152725714455/voice-platforms-open-alternative-is-an

The answer to how we gain acceptance......we need to not suck. The user experience needs to be on par with Alexa or Assistant. As I pointed out above, right now we aren't there, but we are working toward it and I expect a pretty solid experience when we release to production in February. Customers will like it and use it if it works well and is friction free. If it sucks......then we end up an "also ran".

PootsForJesus7 karma

How are you different in terms of, say, Snips?

oojoshua22 karma

We provide a full voice stack ( wakeword, stt, nlu, core, tts ), ship a smart speaker and are fully open source. Snips is a great company and they are building a great voice command system, but we are building a Strong AI. That is a very different mission statement.

Also: we're not subsidized by our government :)

takjek6 karma

What are the biggest challenges that you see in the coming year?

oojoshua23 karma

Hiring. Seriously, we need to hire a ton of people and that means finding them, getting them to quit good jobs ( the best people already have jobs mostly ) and integrating them into our team.

I also see distribution as a problem to solve. I've spent some time in Bentonville recently and feel good about opportunities in Arkansas, but we need to distribute globally. How do we get that going? How do we support people in Fiji and Iceland and Sao Tome? These are all tough problems.

I also see completion bias as an enemy that we need to conquer. Right now we need to focus on the top 10 skills that people use. Timers are boring, but they are one of the most commonly used applications. We need to get the user experience right for timers before launching into video chat. We need to stay focused and with so many great ideas coming from the open source community that can be difficult.

snowlovesnow5 karma

What car do you drive?

oojoshua14 karma

I drive a white 2012 Chevy Volt. I really love it. With the exception of the limited visibility due to the high windows it is a pretty nice little car.

riaKoob14 karma

Hello! I'm thinking of jumping into AI, machine learning. I have a CS degree, but I'm not sure where to start to switch to this field. What are your suggestions?

oojoshua23 karma

You should do that, learn how to use ML and come work for us. We have 2x ML positions open right now.

That aside, I'm a strong believer in project based learning. To familiarize myself with the technology I built an image classifier that could sort American coins.

If you're looking for a project I'd suggest a classifier that not only sorts coins, but reads the date and mint from the coin, then flags if it is worth more than face value. Add a mechanical sorting mechanism to it and then dump in buckets of loose change to get the silver dimes, quarters and any rare coins out of it. Would be fun and profitable.
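For anyone taking up the suggestion, the flagging step that would sit after the image classifier can be sketched as below. It assumes the classifier already produced a denomination, year and mint mark. The `Coin` type, the function names and the rare-coin watch list are hypothetical. The silver rule relies on US dimes and quarters dated 1964 or earlier being 90% silver.

```python
from dataclasses import dataclass

SILVER_CUTOFF_YEAR = 1964  # last year of 90% silver dimes and quarters
SILVER_DENOMINATIONS = {"dime", "quarter"}


@dataclass(frozen=True)
class Coin:
    """Hypothetical classifier output for one coin."""
    denomination: str
    year: int
    mint: str


def worth_more_than_face(coin, rare=frozenset()):
    """Flag silver coins and anything on a user-supplied watch list."""
    is_silver = (coin.denomination in SILVER_DENOMINATIONS
                 and coin.year <= SILVER_CUTOFF_YEAR)
    return is_silver or (coin.denomination, coin.year, coin.mint) in rare


def sort_bucket(coins, rare=frozenset()):
    """Split a bucket of coins into ones to keep and ones to spend."""
    keep, spend = [], []
    for coin in coins:
        (keep if worth_more_than_face(coin, rare) else spend).append(coin)
    return keep, spend


# Illustrative watch list entry; a real one would come from a price guide.
watch_list = {("penny", 1909, "S")}
bucket = [Coin("dime", 1962, "D"), Coin("quarter", 1985, "P"),
          Coin("penny", 1909, "S")]
keep, spend = sort_bucket(bucket, watch_list)
print(len(keep), "to keep,", len(spend), "to spend")
```

The hard part of the project is still the vision model that reads the date and mint mark; this only shows what to do with its output.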

banannapandas4 karma

How in control are users in actuality? Many tech companies boast how users are in complete control of the product but that isn't the reality. I have little experience coding so how would that translate specifically for a product like Mark II? Thanks! I'm really excited for something open like Mycroft to compete in the market :)

oojoshua14 karma

Our customers have complete visibility of the source code and the hardware schematics all the way down to the chip level. Anyone can change any part of the software and use it as they see fit.

Right now the only part of our stack not released to the public is our back-end system that runs home.mycroft.ai and the skills store. That code will be released under an AGPL license ( or similar ) some time next year. We'd release it sooner, but need to do a thorough security review before making it public.

Isoneguy4 karma

Would you like a refreshing beverage?

oojoshua9 karma

A Diet Mountain Dew would be fantastic, but I'm in Europe where, sadly, Diet Mountain Dew isn't a thing. I brought a 12 pack with me on Monday and am down to 6.

takjek3 karma

Do you plan on adding some form of speaker detection? E.g. if I ask something, I might get a different answer than if my girlfriend asks something.

oojoshua17 karma

Yes. We are working with a partner who can provide banking level secure identification, but no contracts have been signed. If we don't get it done with a partner I expect we'll do this work in-house. The schedule will then be driven by personnel ( we're hiring! ) or.....and this would be awesome.....someone in our community might solve this problem and donate the solution. Has happened in the past.....https://www.youtube.com/watch?v=ytKUTBfjnQI

jarobat3 karma

How involved are you with the astromechs?

oojoshua7 karma

We haven't done a lot of work with them. I had some conversations with the KC chapter early on, but we were a bit early. I'd love to see an Astromech running Mycroft as a control agent. "R2, come over here!"

takjek2 karma

If I understand it correctly, DeepSpeech is using both the opt-in data from Mycroft and the data from Common Voice to improve the neural net. To some extent, this limits the data that is truly open source. Would there be an option to add a third option to select some of your phrases to be published in the public domain? In that sense, everybody interested in voice data could benefit.

On the same topic, as you mentioned previously, data can disappear if a user cancels the opt-in. Could it therefore be that the DeepSpeech quality fluctuates month over month?

oojoshua13 karma

We will publish our data as an open data set, but people using it have to agree to refresh it every 30 days so that people can be forgotten. We'll probably need to vet these individuals / organizations as well to prevent bad behavior. We don't have any cryptographic way to enforce the 30 day refresh period so we have to do it with contracts. Contracts are only as good as the counter party.

Can the quality fluctuate? Yes, a new model trained with different data or trained in a different way may be less accurate than an existing model. In that case.....use the existing model. You do bring up an important point however - people do have the right to be forgotten. That means if we misbehave our community can all demand to be forgotten and basically nuke the voice data. That is a pretty powerful motive for us to not be d-bags.

Lighting2 karma

Do you have consumer privacy protection built into your system if purchased by another company?

oojoshua12 karma

The Mycroft agent is being developed to be private by design. We have some work to do on the crypto side, but I'm happy with our progress relative to resources as of now.

If another company uses our software to service their customers we don't have any input into the decision. So a Mycroft designed speaker being sold under a different brand using a private instance of our back-end would follow the policies of the company whose brand is on the box. My guess is that most of these companies will want to better understand their customers so I'd expect their privacy policy to diverge from ours.

If you want a privacy focused device buy one with our logo on it.

Lighting5 karma

Interesting I'll check out your system. Incidentally there's a 404 link to List of Community-Contributed Skills from https://mycroft.ai/get-started/

I guess I didn't phrase my question well. Sorry. Let me restate: If your entire company were purchased by IBM or Facebook or Binladin Group or ... whatever, do you have consumer privacy protection built into your back end system ?

oojoshua9 karma

My guess is that the code would fork, we'd offer the community the ability to download their data and settings ( and delete them ), then a community or company would stand up to host a private version. Open source is awesome that way.

If we do get acquired this is going to be a point of discussion and I plan to advocate for transparent treatment of our customers. I've been paying close attention to what happened to the founders of Pinterest and WhatsApp and won't climb into their shoes willingly.

somebadmeme2 karma

Favourite meal?

oojoshua2 karma

Chicken Baked Sam, Corn on the Cobb, Strawberry Salad and Lemonade.

takjek1 karma

What did you learn from the recent switch to more languages?

oojoshua5 karma

Languages are hard, but the machine learning approach is portable between languages. I've also discovered that people around the world are basically the same. They want the same things: safe streets, good schools, economic security, etc. and their desires extend to technology. I personally would be uncomfortable with a smart speaker that only speaks Spanish because I don't speak Spanish in my home ( or very well ). People in Spain want a voice assistant that speaks Spanish. In Barcelona they want one that speaks Catalan. We are working to make that happen, but have a long way to go before we support the thousands of languages spoken around the world.