The Past, Present and Future of LlamaIndex

This is the first interview in our series and I couldn’t have asked for a better guest than Jerry Liu, the creator of LlamaIndex. LlamaIndex is a framework that helps you connect Large Language Models with your own data. This opens up a wide range of possibilities that were not available before: building customer service bots that answer customer queries based on documentation, extracting insights from the huge amount of unstructured data in companies, parsing through any source of information such as books, videos, and podcasts to accelerate your learning, and many more. This is reflected in the library's growth too. Over the last few months, the framework has been growing at a rate of almost 200% month over month, and it does not look like stopping anytime soon.

This is why I was super excited to talk with Jerry. Together we discussed how the project got started, the core concepts behind the framework, the mental models that help you wrap your head around LLMs and augment them with your own data, the team's vision for the project, and how you can help out and join the movement!

So without further ado, Jerry Liu for you guys!

https://www.youtube.com/watch?v=9iX-iBT_gwQ

Note: This transcript was generated with AWS transcriber and edited with gpt3.5 turbo through Langchain. I was just a proofreader 🙂

Jithin: Welcome to the Exploding Gradients podcast. Today, we have Jerry Liu, the creator of LlamaIndex, a framework that connects large language models with your data. This framework has numerous use cases, such as building a customer service bot or extracting information from unstructured data. It can also accelerate learning by parsing knowledge from books, blogs, podcasts, and videos. Jerry discusses how he started the project, its core concepts, and the vision for the long and short terms. The framework is growing at a rate of almost 200% month over month, and Jerry explains how listeners can contribute and join the movement. It's great to have Jerry with us, and we start with how he keeps up with this rapidly evolving field.

Jerry: Well, first off, thank you for having me. I'm thrilled to be here. There's a lot of AI activity happening across various verticals, resulting in numerous new research papers and publications being featured on platforms such as Hacker News, Twitter, and LinkedIn. It's exciting to witness all the builders in this space adding cool demos and features. Though I've been quite busy, it's been an energizing experience due to the constant buzz of activity.

Jithin: That's true. Any tips for keeping up? The rapid pace of change has made it very challenging. What's your strategy?

Jerry: I spend an undue amount of time on Twitter and Hacker News. I browse articles and market maps that show the current state of AI. That's basically what I track the most.

Jithin: Okay, I understand. About LlamaIndex, I'm curious about its early days and founding story. Can you briefly share the problem you were trying to solve and how it became the big project it is today?

Jerry: Yeah, totally! I started LlamaIndex a few months ago, back in November, as a tool to solve the context limit problem for language models. The aim was to build novel structures that could index a larger corpus of data and still feed the relevant sections into the language model. As a thought exercise, I came up with the idea of a tree index that the language model could use to traverse large amounts of information. The project grew as more people became interested, and now LlamaIndex offers a variety of tools around data ingestion, indexing, and querying, as well as supporting modules for evaluation, output parsing, and token usage optimization.
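To make the tree index idea concrete, here is a minimal sketch, not LlamaIndex's actual implementation. It assumes that parent nodes hold summaries of their children and that a query walks from the root down to one leaf. The helpers `summarize` and `overlap` are hypothetical placeholders for the language model calls Jerry describes: the first simply joins text and the second counts shared words.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    text: str  # leaf: a raw chunk; parent: a summary of its children
    children: list[Node] = field(default_factory=list)


def summarize(texts: list[str]) -> str:
    # Hypothetical stand-in for an LLM summarization call: joins the texts.
    return " ".join(texts)


def overlap(query: str, text: str) -> int:
    # Hypothetical stand-in for an LLM relevance judgment: shared lowercase words.
    return len(set(query.lower().split()) & set(text.lower().split()))


def build_tree(chunks: list[str], fanout: int = 2) -> Node:
    # Group nodes bottom-up, `fanout` at a time, until a single root remains.
    if not chunks:
        raise ValueError("need at least one chunk")
    level = [Node(chunk) for chunk in chunks]
    while len(level) > 1:
        level = [
            Node(summarize([n.text for n in level[i:i + fanout]]), level[i:i + fanout])
            for i in range(0, len(level), fanout)
        ]
    return level[0]


def traverse(root: Node, query: str) -> str:
    # Walk from the root to a leaf, picking the most relevant child at each step.
    node = root
    while node.children:
        node = max(node.children, key=lambda child: overlap(query, child.text))
    return node.text
```

Because every step commits to a single child, one wrong choice near the root sends the query down the wrong branch, which is one way to picture why errors grow with the depth of the tree.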

Jithin: Awesome! One thing that I found interesting is that there are two big projects in the space: LlamaIndex and Langchain. You and Harrison have a lot in common, including Robust Intelligence. Both of you have worked for the same company and built impactful projects. Was there something behind this coincidence?

Jerry: I think it was a coincidence that both of us were interested in this space. I had dinner conversations with Harrison in October about language models, and we bounced around some fun ideas. It's funny how this turned out, with two big open-source projects now: LlamaIndex and Langchain. Harrison used to be my coworker, so we have a relationship.

Jithin: Nice! So, about LlamaIndex and its early story: the first attempt was the Tree Index. How did that go? Over time, you added other indexes. What are the most effective go-to indexes in a basic pipeline now?

Jerry: There is a clear difference between the project's initial design exercise and its current goal of offering practical value to users. The initial idea was a Tree Index that used the language model to process, organize, and traverse information. However, it had practical limitations, and errors increased with the depth and size of the tree. Nowadays, most people start with a vector store or embedding-based approach to retrieve documents. From there, the documents can be put into the input prompt and synthesized to produce a response.
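The retrieve-then-synthesize flow described here can be sketched in a few lines. This is an illustration under stated assumptions, not LlamaIndex code: `embed` is a toy bag-of-words stand-in for a real embedding model, and `build_prompt` only assembles the prompt that would be sent to a language model.

```python
from __future__ import annotations

import math
from collections import Counter


def embed(text: str) -> Counter:
    # Toy stand-in for an embedding model: a bag-of-words count vector.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, docs: list[str], top_k: int = 2) -> list[str]:
    # Rank documents by similarity to the query and keep the top_k.
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:top_k]


def build_prompt(query: str, context: list[str]) -> str:
    # Put the retrieved documents into the input prompt for synthesis.
    joined = "\n".join(f"- {c}" for c in context)
    return f"Context:\n{joined}\n\nQuestion: {query}\nAnswer:"
```

A real pipeline would swap `embed` for a learned embedding model and pass the result of `build_prompt` to a language model to produce the final response.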

Jithin: I've been surprised by how effective simple embeddings can be. With quick data filtering and cleaning, the results are usually quite good and can provide an 80-90% solution for production.

Jerry: I believe the tool is easy to use, and it is common for most people to begin with an embedding approach when starting a project. However, I caution against relying solely on this technique, as it may result in overfitting. Although embedding similarity can be effective for fact-based retrieval or when the information can be found in a specific document section, there are cases where a more structured approach is necessary for querying data. This is where LlamaIndex's generalization of the retrieval and synthesis framework comes into play, with top-k semantic search as just one example.

Jithin: I have a follow-up question. Could you provide some examples to further elaborate on the topic and other solutions for people to build on using different techniques?

Jerry: We have a large section of different use cases for LlamaIndex. Think of it as a general framework that takes in a query and gives a response along with a set of retrieved sources. We manage interactions between your language model and your data for optimal results. We provide tools for summarization and for a unified query interface, along with guidance on building one. We also support structured data and defining graphs for different retrieval use cases. Each module provides a certain way of getting results from your data. Our long-term vision is to build a state to perform retrieval and synthesis for different modes.

Jithin: Understood. So the end goal you are building towards is the QueryRouter. Is this the router that maps the different types of queries one could ask LlamaIndex or consult in your documentation? Is this part of your current solution?

Jerry: Yes, you can currently achieve this within LlamaIndex using the index router. We are exploring ways to make it more universal. Essentially, a router is like a node with links to multiple indexes. With our composability framework, you can use the tree index to create a router index and route questions to underlying indexes for problem-solving. It functions like a mini-agent that selects the appropriate index for the task at hand.
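The "node with links to multiple indexes" idea can be sketched as follows. This is a hypothetical illustration, not the LlamaIndex router API: the `Route` class and `route` function are invented for this example, and a word-overlap score stands in for the language model's choice of index.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass
class Route:
    description: str              # what the underlying index is good for
    answer: Callable[[str], str]  # hypothetical query function of that index


def score(query: str, description: str) -> int:
    # Stand-in for the LLM's selection step: words shared with the description.
    return len(set(query.lower().split()) & set(description.lower().split()))


def route(query: str, routes: list[Route]) -> str:
    # Pick the best-matching index and forward the question to it.
    best = max(routes, key=lambda r: score(query, r.description))
    return best.answer(query)
```

For instance, a router over one index described as "summarize the whole document" and another described as "look up a specific fact" would send "please summarize the document" to the first.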

Jithin: Understood, the technical aspects are interesting. However, I highly recommend checking out the documentation which offers specific tutorials for various use cases. So do refer to the documentation for more information. Overall, there have been many fascinating use cases and creations built with LlamaIndex due to its ease of starting with just a few lines of code. What are some of the interesting use cases you have seen from the community?

Jerry: Definitely. LlamaIndex helps you quickly access your private data sources, including diverse formats like PDFs, web pages, videos, and audio. We even developed a video-to-text parser, which simplifies the data landscape. You can transform raw, unstructured data into text format and ask questions over it without complicated processing. This offers practical value compared to some repetitive agent systems. LlamaIndex also offers core primitives, like indexes and vector stores, which can be used to create AGI models. Our contributors built cool AGI demos, including medical research, which are available in our repository or on Streamlit.

Jithin: That was creative, and BabyAGI has also been interesting, although it is currently still at the stage of being an interesting demo. However, the basic components are present: it has memory and planning abilities, much like a human brain. There may be something interesting to come of this, and the core is strong. For a builder creating a product, using LlamaIndex and other frameworks can solve problems without worrying about loading or effectiveness. The challenge now becomes determining where and how to build the moat, which needs to be something other than just technology.

Jerry: It's a good point that we should offer the core tech for Q&A systems to users without them having to worry about it. Semantic search works well for basic queries, but we want to add more to solve broader queries. We can take responsibility for that. The application layer shouldn't be just a thin wrapper around AI. Developers need to build high-level UXes that encourage different user interactions. It's less about fancy algorithms and more about thinking about the best ways to interact with AI systems these days. There are certain use cases that people enjoy using, like GitHub Copilot, and others that feel like an add-on feature. Depending on algorithms can have downsides, like being expensive and slow, which can frustrate users.

Jithin: Understood. So UX is a crucial aspect for builders to consider. Like the Notion AI interface which I feel is more intuitive than ChatGPT's interface because Notion allows for text selection, interaction, and questioning. The data and other aspects of the product become even more significant. Regarding the project's future, what short-term plans are in store? What projects are currently exciting?

Jerry: I'm glad you asked that question. I think v0.6 should have landed by the time of this podcast (it did, with a few changes). LlamaIndex offers a high-level API that gets you started in three lines of code, making it easy to ask questions and receive answers. We are refactoring to solve both simple and advanced use cases, with more modularity and customizable flexibility. Our toolset is becoming more principled and enterprise-ready by creating good abstractions for state management and integrating with databases and object stores. We are excited about the idea of a unified query interface for users, where we can configure interactions automatically and provide a clean, high-level interface. The high-level idea is to abstract away complexity and offer building blocks for advanced users to create different components themselves.

Jithin: Got it. In the short term, you expose and modify small components to optimize your index for specific datasets. In the long term, you don't need to worry about modification. By giving your data to LlamaIndex, it will build the right indexes and agents to simplify the complexities and configurations. Then, you can use a single endpoint to ask your questions and get the answers you want.

Jerry: I believe it's a powerful solution to a challenging technical problem. It will offer great value to developers who struggle with data and language model integration for optimal interaction patterns. Simplifying this complexity is our goal.

Jithin: Okay, so you mentioned that there are many community contributions. As someone interested in contributing to LlamaIndex, where would be the best place to start? There are various models and moving parts right now, so what are some areas that need help?

Jerry: We have a contribution guide in the core repository with independent modules that are easy to contribute to. For example, we have over 80 data loaders that can still be expanded. There are many services we want to integrate with, including retrieval, query engine, optimizers to reduce token usage, post processors, text splitters, evaluation, and output parsers. All of these are described in the documentation. We welcome contributions to core features, bug fixes, documentation updates, and cool experiments. Our new repo, Llama-lab, welcomes cool experiments, so check that out. Contributors to core modules now also receive a limited edition LlamaIndex t-shirt.

Jithin: There are many components involved in the LlamaIndex project, which presents numerous opportunities for contributions. The project has experienced significant growth, making it a viable enterprise solution. As more people come to depend on it, what specific enterprise use cases do you see for the project?

Jerry: It's a good question. Enterprise use cases relate to things that complement the open-source project. Data management and query are key factors to consider, especially for managing large volumes like gigabytes or terabytes of data. It may be beneficial to rely on hosted services rather than running the data locally. Table-stakes enterprise features like data access control and user management are necessary for most hosted offerings, but not needed in the open-source project. Additionally, optimizing between different model use cases is important, and providing a service that profiles model performance can benefit the enterprise version.

Jithin: Out of curiosity, have you had experiences where fine-tuning your embedding has resulted in better performance even though it's smaller and cheaper to run?

Jerry: It's an open question. If you're interested in contributing, more experiments are needed. People are still figuring it out. Personally, I'm interested in a fast and cheap retrieval model that's good at reasoning. OpenAI's language models are popular, with GPT-4 being one of the best. For cheap semantic embedding, Hugging Face and other types of embeddings can be used. It's interesting to explore the tradeoff between performance and cost for a locally hosted embedding model.

Jithin: Understood. We should wrap up soon as it's nearly time. We covered interesting topics and provided all relevant docs and links in the description for further reading. The LlamaIndex community is available for ongoing discussions and any remaining questions. Thank you, Jerry, for the insightful conversation. We gained valuable insights and ideas about the project's direction and components. Excited to see what the future holds. Fingers crossed.
