Google is search. When you think of search you think of Google’s search. They’ve redefined what we expect from searching: before Google we were pretty happy with exact text matches and guessing at what we wanted. Google changed that. You don’t need to spell things right, your past searches are taken into account when working out what’s relevant, and you can ask it just about any oddball question and get an answer back. Google does search better than anyone else on the planet.
We generally expect search anywhere else to be somewhat like Google’s, because it’s the gold standard. But searching well is hard. It takes a lot of computation and piles of data to do it even ‘alright’ in comparison to Google.
Last week we made a dramatic improvement to our searching. Sasha wrote a great post about it. It’s not Google good but I think it’s better than any of our competitors by a mile. Let’s talk about some of the technical details.
The most basic search is to see if what you’re looking for matches up with existing data.
Say you’re looking for a customer with the name of ‘John Doe’ – a basic search for that would be to look through our customer table for anyone with a name that exactly matches ‘John Doe’. That’s an exact search. You know the exact match you’re looking for and the place that piece of data is stored.
It’s also the least useful kind of search. Sure, you might know some customer names exactly, but what happens when someone else entered the customer name and misspelled it? Or when you typed it the way it sounded to you and it’s actually spelled very differently?
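To make that limitation concrete, here is a minimal sketch of an exact search in Python. The customer records and the `exact_search` helper are made up for illustration; they are not our actual schema or code.

```python
# Hypothetical customer records; one was entered with a typo.
customers = [
    {"id": 1, "name": "John Doe"},
    {"id": 2, "name": "Jon Doe"},   # misspelled when it was entered
    {"id": 3, "name": "Jane Roe"},
]

def exact_search(records, name):
    """Return only the records whose name equals the query exactly."""
    return [r for r in records if r["name"] == name]

matches = exact_search(customers, "John Doe")
print([r["id"] for r in matches])
```

The misspelled ‘Jon Doe’ record never shows up, and neither does anything if you type the name in a different case.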
You need partial matching to make that better. If you start typing ‘John Doe’ and get to ‘Joh’, it’s helpful to see that you’re whittling down your results. By the time you get to ‘John D’ you probably see the person you’re looking for. That’s a lot better than needing to remember how to spell a complicated last name.
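A rough sketch of that whittling-down, assuming the simplest form of partial matching (a case-insensitive prefix match). The names and the `prefix_search` helper are hypothetical:

```python
customers = ["John Doe", "John Dorsey", "Johanna Smith", "Jane Roe"]

def prefix_search(names, typed):
    """Return names that start with what has been typed so far, ignoring case."""
    typed = typed.casefold()
    return [n for n in names if n.casefold().startswith(typed)]

# Each extra keystroke narrows the list of candidates.
for typed in ("Joh", "John D", "John Doe"):
    print(typed, "->", prefix_search(customers, typed))
```

A real search box would usually match on any word in the name rather than only the start of the whole string, but the narrowing effect is the same.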
More often you have multiple places where a phone number or address can be stored. You know the address you want to look up, but you don’t know if it’s a device address or a customer’s mailing address. You don’t want to search twice!
Enter searching across multiple elements. You want to be able to look for not only a customer’s name, but also their phone numbers, email addresses, addresses and a whole slew of other components.
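Here is a small sketch of one query checked against several fields at once. The field names (`mailing_address`, `device_address` and so on) and the records are assumptions made up for this example:

```python
# Hypothetical records: the same kind of value can live in different fields.
customers = [
    {"id": 1, "name": "John Doe", "phone": "555-0100",
     "email": "john@example.com", "mailing_address": "12 Oak St",
     "device_address": "98 Pine Ave"},
    {"id": 2, "name": "Jane Roe", "phone": "555-0199",
     "email": "jane@example.com", "mailing_address": "98 Pine Ave",
     "device_address": "4 Elm Rd"},
]

SEARCH_FIELDS = ("name", "phone", "email", "mailing_address", "device_address")

def search_all_fields(records, query):
    """Return (record id, field) pairs where the query appears in any searchable field."""
    q = query.casefold()
    hits = []
    for record in records:
        for field in SEARCH_FIELDS:
            if q in record.get(field, "").casefold():
                hits.append((record["id"], field))
    return hits

print(search_all_fields(customers, "98 pine"))
```

One search finds the address whether it was stored as a device address or a mailing address, so you never have to search twice.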
Doing that in a relational database (what we use for much of our data storage) can get pretty slow. Relational databases are meant to look up information very quickly when you already know something that identifies the record. Records are stored as rows in tables and related to each other. Finding a specific row (a customer) can be *very* quick. But searching for a customer’s last name without their first name might mean matching text inside a field, which is much less efficient. Now do that across 30-50 fields and you quickly find out that your search isn’t fast enough.
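You can see the difference by asking a database how it plans to run each kind of query. This sketch uses SQLite from Python’s standard library purely as a stand-in (it is not necessarily the database we run), with a made-up `customers` table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
)
conn.executemany(
    "INSERT INTO customers (first_name, last_name) VALUES (?, ?)",
    [("John", "Doe"), ("Jane", "Roe"), ("Sam", "Doerr")],
)

def query_plan(sql, params=()):
    """Return SQLite's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # the last column holds the plan text

# Looking up a row by its key: the database can jump straight to it.
by_id = query_plan("SELECT * FROM customers WHERE id = ?", (1,))

# Matching text inside a field: the database has to walk every row.
by_partial_name = query_plan(
    "SELECT * FROM customers WHERE last_name LIKE ?", ("%doe%",)
)

print(by_id)
print(by_partial_name)
```

The first plan is reported as a SEARCH on the primary key, while the second is a SCAN of the whole table. The exact wording of the plan text varies between SQLite versions.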
Not fast enough is bad!
Enter systems built specifically for searching. We’re using the most popular one around: Elasticsearch.
Elasticsearch to the rescue!
Elasticsearch was built from the ground up to be ridiculously fast at searching disparate data – phone numbers, email, devices, customers, it doesn’t matter – Elasticsearch will scour it all much faster than any relational database and give you exactly the result you’re looking for.
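For a flavor of what a search across disparate fields looks like on the Elasticsearch side, here is a sketch that builds a `multi_match` query body in Python. The `multi_match` query and its `phrase_prefix` type are standard Elasticsearch features, but the field names and the `build_search_body` helper are assumptions for illustration and our real queries may look different:

```python
import json

def build_search_body(text, fields):
    """Build an Elasticsearch query body that checks one piece of text against many fields."""
    return {
        "query": {
            "multi_match": {
                "query": text,
                "fields": list(fields),
                # phrase_prefix treats the last word as a prefix,
                # which suits as-you-type input such as "John D"
                "type": "phrase_prefix",
            }
        }
    }

# Hypothetical field names; this only builds the request body, it does not send it.
body = build_search_body("John D", ["name", "phone", "email", "mailing_address"])
print(json.dumps(body, indent=2))
```

The resulting JSON is what would be sent to an index’s search endpoint.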
Another storage mechanism means we need to synchronize it with our existing database. Every time a record in our relational database is updated (i.e., you update a customer or schedule a test), we update our Elasticsearch database too. This means your search stays up to date with any changes that you or anyone else in your company makes.
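The idea of keeping the two stores in step can be sketched like this. Both stores are plain in-memory dictionaries standing in for the real relational database and the Elasticsearch index, and every function name here is hypothetical:

```python
# In-memory stand-ins for the two stores (assumptions for illustration only).
relational_db = {}   # customer id -> full row
search_index = {}    # customer id -> searchable document

def to_search_document(row):
    """Pick out the fields we want to be searchable."""
    return {"name": row["name"], "phone": row["phone"], "email": row["email"]}

def save_customer(customer_id, row):
    """Write to the primary store first, then mirror the change into the search index."""
    relational_db[customer_id] = dict(row)
    search_index[customer_id] = to_search_document(row)

def delete_customer(customer_id):
    """Remove the record from both stores so stale results don't show up."""
    relational_db.pop(customer_id, None)
    search_index.pop(customer_id, None)

save_customer(1, {"name": "John Doe", "phone": "555-0100", "email": "john@example.com"})
# Updating the customer also refreshes what search sees.
save_customer(1, {"name": "John Doe", "phone": "555-0142", "email": "john@example.com"})
print(search_index[1]["phone"])
```

In a real system the second write can fail or lag behind the first, so a production version would also need retries or a periodic re-sync. This sketch leaves that out.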
Even neater – we’re only using a small portion of what Elasticsearch offers us. Now that we have the basic functionality implemented we can use it to do so much more.
How can we make our search more useful for your company? Let us know!