Sometimes you have to stick with “just good enough”

Published in product, code on January 26th, 2024.

You don’t always need the shiniest or newest piece of tech to get things done.

Just as you can use various services or packages out there to skip the line and not have to reinvent the wheel when building products, the opposite equally applies.

And it’s so important to know when to choose one or the other approach.

Some time ago, we started seeing the quality of the results a search module returned in one of our apps degrade. The function allowed you to look for a specific record across multiple models with similar attributes, like a person’s name.

This worked great when we first launched the platform and the data was, to some extent, more distinct.

Once we reached a few hundred thousand data points, similarities began to appear, and the search module no longer returned the specific results we were looking for. Actually, it returned too many records and thus a lot of noise.

Some background information: when implementing a basic search engine, the approach is to query the database for occurrences of a given string in one or more columns. Any record that matches your conditions becomes part of the results the system returns.

For instance, once your database grows to include numerous entries like Eliza, Liz, and Elizabeth, the search results for "Liz" can become overwhelming with such simple logic. Add a last name or an e-mail, since we had a multi-term search, and it’s gonna get frustrating. It sure did for us and also for our users.
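A minimal sketch of this kind of matching, for illustration only: the `Person` record, the sample data, and the `naive_search` helper are hypothetical and not taken from the actual app, and a real implementation would typically run the equivalent query against the database.

```python
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    email: str


def naive_search(records, query):
    """Return every record where any search term occurs in any column."""
    terms = query.lower().split()
    return [
        r for r in records
        if any(t in r.name.lower() or t in r.email.lower() for t in terms)
    ]


# Hypothetical sample data.
people = [
    Person("Eliza Stone", "eliza@example.com"),
    Person("Elizabeth Moore", "beth@example.com"),
    Person("Liz Carter", "liz.carter@example.com"),
    Person("John Smith", "john@example.com"),
]

# Every record containing "liz" anywhere comes back, in storage order,
# so the exact match is buried among the partial ones.
matches = naive_search(people, "liz")
```

With only four records this is harmless. With a few hundred thousand, the one Liz you are after can sit anywhere in a very long list.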

So we discussed this topic.

The most obvious solution was to resort to technologies like Algolia or Elasticsearch. It was also the one that raised a smile and got the “ooh, let’s do it” from the technical roles.

With the right amount of time, we could’ve integrated either of these two into the search module to provide more meaningful results to queries.

We didn't have enough time available, though, as we were juggling multiple other topics of higher priority, so we were facing two choices:

  1. postpone this in full or

  2. stay flexible and take a shot at getting a “this is good enough”.

Algolia would’ve been pretty easy to set up from a technical perspective, but it presented a legal challenge: we had to be mindful of the consent received and remain compliant with the local laws. This approach would have required us to send a vast amount of personally identifiable information (PII) to the vendor's servers for indexing. That’s not a straightforward task from a privacy standpoint. Doable, but it would have taken a couple of weeks at least and the involvement of multiple other areas. It was a totally different scenario compared to using Algolia for a knowledge base or products in an e-commerce store.

Elasticsearch, being self-hostable and not raising legal concerns, still required a significant time investment due to its complexity, integration needs, and the DevOps work it demands.

The actual solution?

We realized that the core problem was not that our search wasn’t including the results we needed, but that it did not prioritize the most meaningful ones.

It took us a few hours in total to modify the algorithm and rank the results based on the weight each searched keyword carried in the final results. And it still brought a nice engineering challenge to the table.

Suddenly, the exact terms you were searching for were ranked first in the results. With that small tweak.

You searched for Liz? That’s what you got first.
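To give an idea of what such a ranking can look like, here is a rough sketch under stated assumptions. It is not the code from the app: the weights (3 for an exact word, 2 for a prefix, 1 for a partial match), the `Person` record, and the `term_weight` and `rank` helpers are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    email: str


# Hypothetical weights: the closer the match, the higher the score.
EXACT, PREFIX, PARTIAL = 3, 2, 1


def term_weight(term, field):
    """Score how well a single lowercase term matches a single field."""
    text = field.lower()
    words = text.split()
    if term in words:
        return EXACT
    if any(w.startswith(term) for w in words):
        return PREFIX
    if term in text:
        return PARTIAL
    return 0


def rank(records, query):
    """Keep every matching record, but order them by total weight."""
    terms = query.lower().split()
    scored = []
    for r in records:
        # Each term counts once, using its best match across the columns.
        total = sum(
            max(term_weight(t, r.name), term_weight(t, r.email)) for t in terms
        )
        if total > 0:
            scored.append((total, r))
    # The sort is stable, so records with equal scores keep their order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [r for _, r in scored]


# Hypothetical sample data.
people = [
    Person("Eliza Stone", "eliza@example.com"),
    Person("Elizabeth Moore", "beth@example.com"),
    Person("Liz Carter", "liz.carter@example.com"),
    Person("John Smith", "john@example.com"),
]

ranked = rank(people, "liz")
```

The set of results is the same as before; only the order changes. In this sketch the exact match on Liz moves to the top, and adding a second term like a last name widens the gap further, because each matching term adds to the total.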

Users were happy and most of them did not know, or even care, what was actually behind it. We just got the job done.

Was it the perfect solution? At that moment for us, yes. From an engineering standpoint? Well, Algolia is an entire business centered around this topic and is in a different league compared to our 30 lines of code tackling this issue. Elasticsearch benefits from a vast community with years of compounded experience.

But. Until the next challenge appears with the search, this solution will be just good enough.