Building a Search Engine
I was re-reading “I am feeling lucky” and it inspired me, so I have decided to create a search engine. This is a very ambitious project; however, it will teach me a lot about building a microservice with Kubernetes, all from scratch.
My plan is to refactor the spider to use a relational database. I am leaning towards each spider having its own SQLite DB. Then I will create a service which goes to the spiders, fetches their current crawl, and adds it to a central DB. This will create a single graph of all the visited nodes.
To determine the weights of the graph, I will take a partial graph, traverse it, and count the nodes linking to each page. This will give me a prominence value for the pages. It will need a lot of websites, a lot, since two websites taken at random have an extremely low chance of being linked.
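That counting step can be sketched in a few lines; this is my naive stand-in for fancier ranking schemes like PageRank, assuming the graph is available as `(src, dst)` edge pairs:

```python
from collections import Counter

def prominence(edges):
    """Count distinct inbound links per page over a (partial) crawl graph.

    edges: iterable of (src, dst) URL pairs. The score for a page is
    simply how many distinct pages link to it.
    """
    # Deduplicate edges first so a page linking twice counts once.
    inbound = Counter(dst for src, dst in set(edges))
    return dict(inbound)

# e.g. prominence([("a", "b"), ("c", "b"), ("a", "c")]) -> {"b": 2, "c": 1}
```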
Next, I initially planned on using OpenSearch/Elasticsearch; however, given the expected traffic, I decided against it. It added a lot more complexity for little benefit. Instead, I will use a relational-DB fuzzy search and order the results by the weighting. It’s a very naive solution; however, I can change this at a later date. The front end will be a very simple React page with a debounce we can use to do some nice, fast searching. Additionally, I can do some optimisations on the table using indexes to reduce the performance hit of not using a dedicated full-text search tool.
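The query side might look something like this, assuming a hypothetical `pages(url, title, weight)` table where `weight` is the prominence value from the link graph; the "fuzzy" part is just a substring `LIKE` match for now:

```python
import sqlite3

def search(db: sqlite3.Connection, term: str, limit: int = 10):
    """Naive fuzzy search: substring match on the title, ordered by weight."""
    pattern = f"%{term}%"
    return db.execute(
        "SELECT url, title FROM pages "
        "WHERE title LIKE ? ORDER BY weight DESC LIMIT ?",
        (pattern, limit),
    ).fetchall()

# An index such as:
#   CREATE INDEX IF NOT EXISTS idx_pages_weight ON pages (weight DESC);
# keeps the ORDER BY cheap, though a leading-wildcard LIKE still has to
# scan rows -- that scan is the cost of skipping a dedicated full-text
# engine, and it is acceptable at low traffic.
```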
I like the challenge this project creates and I think it will be fun. I will try to update this blog with how things are working out.