Welcome to another round of Albus intelligence updates. In this update, I will talk about how our engineering team made big leaps in finding you the right answers. Faster. And more often.
Let's start with a question: do you know what happens when you combine the top two LLMs in the world?
You should see what happens now that Albus uses Cohere!
Cohere helps him rerank sources and their corresponding information before writing an answer. This means that when you ask Albus a question, he digs deeper into your sources, fetches more than three times as much information as usual, and comes back with a far more contextual answer.
How does that happen? Before Cohere, Albus was picking up the top 3 sources that matched your question to write an answer. This meant that if he found 10 relevant sources and your answer lay in the 7th source, he would ignore it.
With Cohere, he can now pick up to 10 sources.
He then reranks those sources based on which source might have the most relevant information. This means when Albus is writing an answer from the top 3 sources, he’s getting higher-quality information.
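For the technically curious, here's a minimal sketch of that retrieve-then-rerank flow in Python. It assumes Cohere's Python SDK and its rerank endpoint; the function name, the API key placeholder, and the model name are illustrative, not Albus's actual internals.

```python
import cohere

co = cohere.Client(api_key="YOUR_API_KEY")  # placeholder, not a real credential

def pick_best_chunks(question: str, candidates: list[str]) -> list[str]:
    """Fetch broadly, then rerank down to the 3 most relevant chunks.

    `candidates` holds up to 10 chunks that vector search already matched
    to the question (the retrieval step itself is elided here).
    """
    reranked = co.rerank(
        query=question,
        documents=candidates,
        top_n=3,                      # keep only the top 3 after reranking
        model="rerank-english-v2.0",  # model name is an assumption
    )
    # Return the winning chunks in relevance order, ready for the prompt.
    return [candidates[r.index] for r in reranked.results]
```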
Here’s an example of how much his results have improved with Cohere and our chunk experiments:
This is a lot smarter, isn’t it? And quite costly too. But we are happy to take on this cost to provide you with a better experience in workplace search.
Moving on from Cohere, Albus has started doing two more essential things. Let's start with metadata:
When you ask Albus to absorb information from a source, which can be a Slack channel, a Notion page, or a Word document (anything), he starts learning and remembering everything in small chunks.
He now adds metadata to each chunk. This includes things like the name of the source, the name of your company and, very soon, the name of your team.
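If you're wondering what that looks like, here's a minimal sketch of a chunk with metadata riding along. The `Chunk` class and the field names are illustrative assumptions, not Albus's actual schema.

```python
from dataclasses import dataclass, field

@dataclass
class Chunk:
    """One small piece of a source, with its metadata attached."""
    text: str
    metadata: dict = field(default_factory=dict)

chunk = Chunk(
    text="The Q3 onboarding checklist lives in the #people-ops channel.",
    metadata={
        "source_name": "People Ops Handbook",  # name of the source
        "company": "Acme Inc.",                # name of your company
        # "team": "People Ops",                # coming soon
    },
)
```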
Sounds pretty basic, doesn’t it? But trust me when I say we had to run a lot of experiments to find the sweet spot where adding metadata does not dilute the quality of his answers.
Splitting information into chunks is like putting the right ingredients in the perfect proportions into a potion. Getting AI to work tailored to your use case isn’t easy. That’s why prompt engineering is a thing in the first place!
While we’re on the topic of chunks, here are my two cents on chunk size.
Breaking information down into the right proportions is an art. How many tokens should we fit in a chunk: 512? 1024? How do we split information and feed in metadata while ensuring we don’t exceed the chunk-size constraints of our vector database?
There’s no clear answer. We have been experimenting with this and will continue to. The only certain outcome is something new to learn and something new to pass on to you.
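To make that trade-off concrete, here's a minimal sketch of token-budgeted splitting: reserve room for the metadata header, then cut the text so each chunk stays under the limit. The tiktoken tokenizer, the 512-token budget, and the header format are all assumptions for illustration, not our production setup.

```python
import tiktoken  # tokenizer library; the encoding choice below is an assumption

enc = tiktoken.get_encoding("cl100k_base")

def split_with_metadata(text: str, header: str, max_tokens: int = 512) -> list[str]:
    """Split `text` so that header + body stays within `max_tokens` per chunk."""
    budget = max_tokens - len(enc.encode(header))  # reserve tokens for metadata
    assert budget > 0, "metadata header alone exceeds the chunk limit"
    tokens = enc.encode(text)
    chunks = []
    for start in range(0, len(tokens), budget):
        body = enc.decode(tokens[start:start + budget])
        chunks.append(header + body)
    return chunks

# Example: split_with_metadata(doc_text, "source: People Ops Handbook | company: Acme\n")
```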