Building a search engine from scratch is quite a challenge, and so is competing in a market significantly dominated by a single player. But we take pride in being a completely independent search engine, reliant on no one else and driven by our own values. We have already built one of the biggest web search indexes in the world and were the first search engine ever to have a no-tracking privacy policy. It’s also becoming more and more apparent that access to information has become a critical part of our everyday lives. Therefore, we believe that for something as important and fundamental to the web as search, there should be greater choice. This is why we’re building Mojeek.

We would love for you to try searching with Mojeek and ask us whatever questions you have – https://www.mojeek.com

Mojeek was built by founder and developer Marc Smith using the C programming language. It began as a personal project, with servers running out of his bedroom, but has since grown to more than 100 servers running out of the UK’s greenest data centre, Custodian. Founder Marc (u/marcls) and Head of Marketing Finn (u/FjjB) will be answering your questions.

Proof: https://twitter.com/mojeek/status/1094180607578513409

Finally, if you want to hang about after the AMA is finished, please feel free to head over to our subreddit (https://www.reddit.com/r/mojeek) where we can continue the conversation, or even subscribe to support Mojeek in building the world’s alternative search engine.

Edit: Thanks for your questions so far, we’re going to take a break for the night and continue in the morning. Speak to you then.

Edit: 13/02/19 Thank you very much everyone for your questions and feedback. We’ve been blown away by the sheer number of responses and now have a lot of food for thought. We need to take another break for the night but will most definitely be back tomorrow to answer any remaining questions and get back to the people we have yet to reply to. We apologise if you’ve been waiting a while for a response.

Edit: Thank you everyone for taking part in the AMA. It’s been incredibly worthwhile to us and has provided an unbelievable amount of useful feedback to take on board. We tried our best to get back to every question, so we apologise if yours slipped through the net. If so, feel free to send us an email or continue the discussion on our subreddit: https://www.reddit.com/r/mojeek

We will be going through all the comments again to take notes and see if there are any that still require a response. We’ve learned an incredible amount, so thank you again for taking the time to comment.

Comments: 992 • Responses: 24

ImpossibleBridge 1080 karma

How is it better/different than Duck Duck Go?

FjjB 868 karma

The main difference is Mojeek is a crawler-based search engine, whereas DuckDuckGo is a metasearch engine. Crawler search engines build their own index of web pages, whereas metasearch engines build on top of other crawlers, like DuckDuckGo whose organic results primarily come from Bing.

We believe having full control of our own algorithm is also what makes us potentially better than DDG.
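To make the crawler versus metasearch distinction concrete, here is a minimal sketch in Python. It is not Mojeek's or DuckDuckGo's code: the function names, the sample pages, and the two stand-in upstream engines are all assumptions made up for illustration. The crawler-style path answers from an inverted index built over pages it holds itself, while the metasearch-style path only forwards the query and merges what other engines return.

```python
from collections import defaultdict


def build_index(pages):
    """Crawler-style: build an inverted index (word -> URLs) from our own pages."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def crawler_search(index, query):
    """Answer from our own index: URLs that contain every query word."""
    words = query.lower().split()
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)


def metasearch(upstreams, query):
    """Metasearch-style: forward the query upstream and merge, dropping duplicates."""
    merged = []
    for engine in upstreams:
        for url in engine(query):
            if url not in merged:
                merged.append(url)
    return merged


# Hypothetical crawled pages (stand-ins for what a crawler would fetch).
pages = {
    "https://example.org/a": "privacy focused search engine",
    "https://example.org/b": "search engine marketing tips",
}
index = build_index(pages)


# Hypothetical upstream engines returning fixed results.
def upstream_one(query):
    return ["https://example.org/a", "https://example.org/c"]


def upstream_two(query):
    return ["https://example.org/c", "https://example.org/d"]


print(crawler_search(index, "search engine"))
print(metasearch([upstream_one, upstream_two], "search engine"))
```

In this sketch the ranking and coverage of `metasearch` depend entirely on what the upstream engines return, whereas `crawler_search` is limited only by what has been crawled and indexed locally.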

JBinero 192 karma

What does unbiased mean in the context of a search engine? How are other search engines biased? Is bias the same as personalised results - and then, isn't that also a disadvantage?

FjjB 82 karma

Bias in search engines can mean many things, from manipulation of the results to preferential display for advertisers or for a particular point of view. It could also include personalised results, and we don’t believe the lack of them is that much of a disadvantage, because most of the personalisation that occurs is for targeted advertising. When you’re on a search engine, your results are based on the keywords you type in to a larger degree than on your history. Who knows if other search engines are biased; some people believe so, and whether they are or not, it shows how important it is that other choices exist.

icelock013 60 karma

Are you using this AMA to test your bandwidth?

FjjB 15 karma

No. It was for feedback and any ideas that may come from that.

dylmye 34 karma

I did a couple of test searches and your site is pretty fast. I like the context cards (eg wiki definitions). However your ordering algorithm seems like it could do with a lot of improvement (I imagine that it's the most complex part of a search engine). How do you plan to improve the ordering algo?

FjjB 5 karma

Our algorithm is highly configurable on a search-by-search basis, so we’re continuously improving it as well as the systems we use to test those improvements. At some point in the future we may consider making some of these algorithm settings configurable via the UI.
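As a rough illustration of what per-search configurability could look like, the Python sketch below passes a set of ranking weights alongside each query. This is an assumption-laden toy, not Mojeek's algorithm: the `score` and `rank` functions, the `title` and `body` weights, and the sample documents are all hypothetical.

```python
def score(doc, terms, weights):
    """Weighted count of query terms found in the title and in the body."""
    title_words = doc["title"].lower().split()
    body_words = doc["body"].lower().split()
    title_hits = sum(t in title_words for t in terms)
    body_hits = sum(t in body_words for t in terms)
    return weights["title"] * title_hits + weights["body"] * body_hits


def rank(docs, query, weights):
    """Order documents by score, using the weights supplied for this search."""
    terms = query.lower().split()
    return sorted(docs, key=lambda d: score(d, terms, weights), reverse=True)


# Hypothetical documents.
docs = [
    {"url": "https://example.org/a", "title": "Independent search", "body": "an engine with its own index"},
    {"url": "https://example.org/b", "title": "News", "body": "tips for better search results"},
]

# The same query ranked under two different per-search settings.
title_heavy = rank(docs, "search", {"title": 2.0, "body": 1.0})
body_heavy = rank(docs, "search", {"title": 0.5, "body": 1.0})
print([d["url"] for d in title_heavy])
print([d["url"] for d in body_heavy])
```

Because the weights travel with the request rather than being fixed in the engine, two settings can be compared on identical queries, which is one plausible way to test ranking changes.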

Mouthmouthmouth 29 karma

Could you explain the "emotions" search? Is the emotion supposed to be how a person will react to the results? Like, search "frowny face" to find things that will make you angry?

needsUnicorn 26 karma

Not OP, but try it out, it's clever. The results returned fit the emotion. I searched for 'dirtbag'. Angry face emoji returned stories of dirtbags going to prison, happy emoji was about loveable dirtbags and Avril Lavigne 'Teenage Dirtbag' concert tickets...etc. Nice work u/FjjB

Mouthmouthmouth 14 karma

Oh, that is pretty neat! I searched "science fiction". The happy face gave me scifi-themed cartoons, the "wow face" gave me amazing science discoveries, etc. Cool stuff!

FjjB 11 karma

Cheers, glad you like it!

comuloid 26 karma

What do you have in place to prevent websites from using spammy techniques to manipulate your results?

How did you choose a name with a J which is not linked to any major brands and could result in failing to produce brand memorability throughout different languages? Name some big brands that use a J.

You mentioned you would consider advertising for funding. How are you going to provide advertising if you do not collect user data? No advertisers are going to want to pay to advertise to 100% of your audience.

How does your (and will your algorithm) compete with algorithms that have been in constant development for dozens of years?

FjjB 8 karma

Fair point, we’ve never thought about the ‘J’ in the name as a problem before. Some people seem to love the name and others not so much. It’s also a matter of coming up with a new one that’s available; then we might consider it.

A major component of the search engine is to build algorithms that satisfy a user's query. Even Google's algorithm, which is considered state of the art, is constantly evolving and having to adapt to what users are searching for.

User data is not a necessity for advertisers (and is also how DDG monetise). The fact that a user is entering a search query describing their intentions provides enough information to an advertiser about the intentions of the user. Having a browsing history of the user such as what Google would have for example, might be useful in further deciding a user's intention and for targeted advertising when on sites other than Google search, but this comes at the sacrifice of their privacy.
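The idea of advertising from the query alone can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Mojeek or DuckDuckGo actually serve ads: the `select_ads` function and the ad records are hypothetical. The point is that the only input is the text of the current search; no user ID, history, or profile is involved.

```python
def select_ads(query, ads):
    """Pick ads whose keywords overlap the query; most overlapping keywords first."""
    terms = set(query.lower().split())
    matches = [ad for ad in ads if terms & ad["keywords"]]
    return sorted(matches, key=lambda ad: len(terms & ad["keywords"]), reverse=True)


# Hypothetical ad inventory, keyed only on advertiser-chosen keywords.
ads = [
    {"name": "travel-deals", "keywords": {"flights", "hotels", "rome"}},
    {"name": "airline", "keywords": {"flights"}},
    {"name": "running-shoes", "keywords": {"shoes", "trainers"}},
]

chosen = select_ads("cheap flights to rome", ads)
print([ad["name"] for ad in chosen])
```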

Our algorithm is designed to return the most relevant results without the need for personal user data, and has also been developed over many years. Now we just need to grow the index considerably.

OathOfFeanor 23 karma

Could you throw some raw numbers at us? How large is the index in TB? How many pages are indexed vs. unindexed? How much bandwidth does the crawler burn per month?

FjjB 14 karma

Our search index now contains 2.3 billion web pages, and we aim to double this by the end of the year, and again next year. But I don’t have the other numbers at hand, so we’ll have to get back to you on this one, hopefully by tomorrow.

srhb 12 karma

Why are your crawler and indexing sources not open? Do you not agree that closed source and trust is a hard sell, and, in the current media clime, justifiably so?

FjjB 10 karma

We value open source software, and note the likes of Gigablast, who have shared their code on GitHub.

Search engines are a curious beast where there needs to be an element of secrecy so that results cannot be manipulated en masse by those who wish to game them. It's a well-known game of cat and mouse between search engines and search engine marketers. Sometimes this is known as SEO; on a more technical level it's been called "adversarial information retrieval". See https://en.wikipedia.org/wiki/Adversarial_information_retrieval

While Mojeek prides itself on unbiased results, we also recognise that having transparent algorithms could lead to a degrading of results. It is a fine line to follow and one that we continually evaluate in order to satisfy all users' requirements.

YourDeformedGod 11 karma

Would it be similar to the DuckDuckGo search engine?

FjjB 14 karma

Yes in respect to privacy, but Mojeek is a crawler-based search engine and therefore we have our own search index and algorithm.

trackofalljades 14 karma

So eventually DDG will probably just incorporate results from Mojeek as well, like it does other search engines, and I don’t need to change anything? Sweet. 😉

FjjB 23 karma

Well yes, if they want to pay us for them. And then at least their back end search provider will be privacy focused as well and have similar values.

Rose_Beef 4 karma

Without tracking, how are you going to monetize this? 100 servers isn't cheap.

FjjB 5 karma

Having our own technology gives us plenty of options, including API access, licensing, advertising and more.

Darkrenga 4 karma

When you say you will provide unbiased results, how will this affect the markup behind websites?

FjjB 3 karma

This doesn’t affect the markup behind any website. Providing unbiased results is just a value we hold strongly.

Lane0720 2 karma

Do you have any plans on how to compete in a market that is so significantly dominated by a single player?

FjjB 5 karma

Our plans right now are simple: build up our index with the aim of retaining more users. At the same time, we’re keeping our costs to a minimum so we can still be a viable and successful company without necessarily challenging Google directly (although that would be the ultimate aim). Also, more and more people are becoming aware of the importance of privacy and the need for an alternative, which is only helping us further.