quivr/backend/api/quivr_api/main.py


import logging
import os

import litellm
import sentry_sdk
from dotenv import load_dotenv # type: ignore
from fastapi import FastAPI, HTTPException, Request
from fastapi.responses import HTMLResponse, JSONResponse
from pyinstrument import Profiler
from quivr_api.logger import get_logger
from quivr_api.middlewares.cors import add_cors_middleware
from quivr_api.modules.analytics.controller.analytics_routes import analytics_router
from quivr_api.modules.api_key.controller import api_key_router
from quivr_api.modules.assistant.controller import assistant_router
from quivr_api.modules.brain.controller import brain_router
from quivr_api.modules.chat.controller import chat_router
from quivr_api.modules.knowledge.controller import knowledge_router
from quivr_api.modules.misc.controller import misc_router
from quivr_api.modules.models.controller.model_routes import model_router
from quivr_api.modules.onboarding.controller import onboarding_router
from quivr_api.modules.prompt.controller import prompt_router
from quivr_api.modules.sync.controller import sync_router
from quivr_api.modules.upload.controller import upload_router
from quivr_api.modules.user.controller import user_router
from quivr_api.packages.utils import handle_request_validation_error
from quivr_api.packages.utils.telemetry import maybe_send_telemetry
from quivr_api.routes.crawl_routes import crawl_router
from quivr_api.routes.subscription_routes import subscription_router
from sentry_sdk.integrations.fastapi import FastApiIntegration
from sentry_sdk.integrations.starlette import StarletteIntegration
load_dotenv()

# Set root logging to INFO and quieten noisy third-party loggers (httpx, LiteLLM)
logging.basicConfig(level=logging.INFO)
logging.getLogger("httpx").setLevel(logging.WARNING)
logging.getLogger("LiteLLM").setLevel(logging.WARNING)
logging.getLogger("litellm").setLevel(logging.WARNING)
get_logger("quivr_core")
litellm.set_verbose = False
logger = get_logger(__name__)
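
# Sentry hook: drop transaction events whose name contains "healthz" so that
# health-check probes do not show up in performance traces.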
def before_send(event, hint):
    # If this is a transaction event
    if event["type"] == "transaction":
        # And the transaction name contains 'healthz'
        if "healthz" in event["transaction"]:
            # Drop the event by returning None
            return None
    # For other events, return them as is
    return event

sentry_dsn = os.getenv("SENTRY_DSN")
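# Sentry is only initialised when a DSN is configured; errors and traces are
# sampled at 10% to keep overhead low.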
if sentry_dsn:
    sentry_sdk.init(
        dsn=sentry_dsn,
        sample_rate=0.1,
        enable_tracing=True,
        traces_sample_rate=0.1,
        integrations=[
            StarletteIntegration(transaction_style="url"),
            FastApiIntegration(transaction_style="url"),
        ],
        before_send=before_send,
    )
app = FastAPI()
add_cors_middleware(app)
app.include_router(brain_router)
app.include_router(chat_router)
app.include_router(crawl_router)
app.include_router(assistant_router)
app.include_router(sync_router)
app.include_router(onboarding_router)
app.include_router(misc_router)
app.include_router(analytics_router)
app.include_router(upload_router)
app.include_router(user_router)
app.include_router(api_key_router)
app.include_router(subscription_router)
app.include_router(prompt_router)
app.include_router(knowledge_router)
app.include_router(model_router)
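
# Optional per-request profiling: with PROFILING=true, appending ?profile=true
# to a request returns pyinstrument's HTML report instead of the usual response
# (e.g. GET /healthz?profile=true when running locally).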
PROFILING = os.getenv("PROFILING", "false").lower() == "true"

if PROFILING:

    @app.middleware("http")
    async def profile_request(request: Request, call_next):
        profiling = request.query_params.get("profile", False)
        if profiling:
            profiler = Profiler()
            profiler.start()
            await call_next(request)
            profiler.stop()
            return HTMLResponse(profiler.output_html())
        else:
            return await call_next(request)
@app.exception_handler(HTTPException)
async def http_exception_handler(_, exc):
    return JSONResponse(
        status_code=exc.status_code,
        content={"detail": exc.detail},
    )

handle_request_validation_error(app)
if os.getenv("TELEMETRY_ENABLED") == "true":
logger.info("Telemetry enabled, we use telemetry to collect anonymous usage data.")
logger.info(
"To disable telemetry, set the TELEMETRY_ENABLED environment variable to false."
)
maybe_send_telemetry("booting", {"status": "ok"})
maybe_send_telemetry("ping", {"ping": "pong"})

if __name__ == "__main__":
    # run main.py to debug backend
    import uvicorn

    uvicorn.run(app, host="0.0.0.0", port=5050, log_level="debug", access_log=False)