quivr/core/tests/fixture_chunks.py
Jacopo Chevallard 285fe5b960
feat: websearch, tool use, user intent, dynamic retrieval, multiple questions (#3424)
# Description

This PR includes far too many new features:

- detection of user intent (closes CORE-211)
- treating multiple questions in parallel (closes CORE-212)
- using the chat history when answering a question (closes CORE-213)
- filtering of retrieved chunks by relevance threshold (closes CORE-217)
- dynamic retrieval of chunks (closes CORE-218); a sketch of these two retrieval ideas is shown below
- enabling web search via Tavily (closes CORE-220)
- enabling the agent / assistant to activate tools when relevant to complete the user's task (closes CORE-224)

Also closes CORE-205
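
The relevance-threshold filtering and dynamic retrieval items are easiest to explain together: retrieve with a small `k`, keep only the chunks whose similarity score clears a threshold, and widen the search when too few qualify. The sketch below shows the general idea only, not quivr_core's actual implementation; the helper `retrieve_relevant` and its parameters are hypothetical, while the vector store calls are standard `langchain_core`.

```python
from langchain_core.documents import Document
from langchain_core.embeddings import DeterministicFakeEmbedding
from langchain_core.vectorstores import InMemoryVectorStore


def retrieve_relevant(
    store: InMemoryVectorStore,
    query: str,
    threshold: float = 0.5,
    k: int = 5,
    max_k: int = 20,
) -> list[Document]:
    """Keep only chunks scoring above `threshold`; grow k if too few qualify."""
    while True:
        scored = store.similarity_search_with_score(query, k=k)  # [(doc, score), ...]
        kept = [doc for doc, score in scored if score >= threshold]
        if kept or k >= max_k:
            return kept
        k = min(2 * k, max_k)  # dynamic retrieval: widen the search and retry


store = InMemoryVectorStore(DeterministicFakeEmbedding(size=20))
store.add_texts(["NLP is the study of language with computers.", "Bananas are yellow."])
print(retrieve_relevant(store, "What is NLP?"))
```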

## Checklist before requesting a review

Please delete options that are not relevant.

- [ ] My code follows the style guidelines of this project
- [ ] I have performed a self-review of my code
- [ ] I have commented hard-to-understand areas
- [ ] I have ideally added tests that prove my fix is effective or that my feature works
- [ ] New and existing unit tests pass locally with my changes
- [ ] Any dependent changes have been merged

## Screenshots (if appropriate):

---------

Co-authored-by: Stan Girard <stan@quivr.app>
2024-10-31 17:57:54 +01:00


import asyncio
import json
from uuid import uuid4

from langchain_core.embeddings import DeterministicFakeEmbedding
from langchain_core.messages.ai import AIMessageChunk
from langchain_core.vectorstores import InMemoryVectorStore

from quivr_core.llm import LLMEndpoint
from quivr_core.rag.entities.chat import ChatHistory
from quivr_core.rag.entities.config import LLMEndpointConfig, RetrievalConfig
from quivr_core.rag.quivr_rag_langgraph import QuivrQARAGLangGraph

async def main():
    # Build a retrieval config around a gpt-4o endpoint and an in-memory store
    # with deterministic fake embeddings, so the fixture needs no real embedder.
    retrieval_config = RetrievalConfig(llm_config=LLMEndpointConfig(model="gpt-4o"))
    embedder = DeterministicFakeEmbedding(size=20)
    vec = InMemoryVectorStore(embedder)

    llm = LLMEndpoint.from_config(retrieval_config.llm_config)
    chat_history = ChatHistory(uuid4(), uuid4())
    rag_pipeline = QuivrQARAGLangGraph(
        retrieval_config=retrieval_config, llm=llm, vector_store=vec
    )

    conversational_qa_chain = rag_pipeline.build_chain()

    # Stream graph events and persist the generated chunks as JSONL.
    with open("response.jsonl", "w") as f:
        async for event in conversational_qa_chain.astream_events(
            {
                "messages": [
                    ("user", "What is NLP, give a very long detailed answer"),
                ],
                "chat_history": chat_history,
                "custom_personality": None,
            },
            version="v1",
            config={"metadata": {}},
        ):
            kind = event["event"]
            # Keep only token chunks emitted by the "generate" node.
            if (
                kind == "on_chat_model_stream"
                and event["metadata"]["langgraph_node"] == "generate"
            ):
                chunk = event["data"]["chunk"]
                # AIMessageChunk values are not JSON-serializable as-is.
                dict_chunk = {
                    k: v.dict() if isinstance(v, AIMessageChunk) else v
                    for k, v in chunk.items()
                }
                f.write(json.dumps(dict_chunk) + "\n")


asyncio.run(main())
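
To inspect the captured fixture afterwards, the JSONL file can be read back one serialized chunk per line; a minimal stdlib-only sketch (not part of the fixture itself):

import json

with open("response.jsonl") as f:
    for line in f:
        chunk = json.loads(line)  # one dict of serialized AIMessageChunk fields per line
        print(chunk)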