quivr/backend/packages/files/parsers
Stan Girard 2be6aac02a
feat(embedding): keeping citations (#2506)
This pull request updates the chunk size and overlap parameters in the
File class to improve performance. It also increases the top_n value in
the compressor for both the CohereRerank and FlashrankRerank models.
Additionally, it ensures that the page content is encoded in UTF-8
before processing.
2024-04-27 05:18:51 -07:00
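The chunk size and overlap parameters mentioned in the commit message can be illustrated with a minimal sketch. This is not the code from the File class: the function names `chunk_text` and `to_utf8`, the character-based splitting, and the numeric values are all hypothetical, chosen only to show how overlap affects the resulting chunks and what encoding page content as UTF-8 looks like.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into character windows that share `chunk_overlap` characters.

    Hypothetical helper for illustration; the real splitter may work on
    tokens or separators rather than raw characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


def to_utf8(page_content: str) -> bytes:
    """Encode page content as UTF-8 bytes before further processing."""
    return page_content.encode("utf-8")


# Example values are assumptions, not the ones set in the commit.
chunks = chunk_text("abcdefghij", chunk_size=4, chunk_overlap=2)
encoded = [to_utf8(chunk) for chunk in chunks]
```

A larger overlap repeats more context between neighbouring chunks, which can help retrieval keep citations attached to the right passage, at the cost of producing more chunks to embed.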
__init__.py refactor: create "files" package (#1626) 2023-11-14 09:52:44 +01:00
audio.py feat(integrations): integration with Notion in the backend (#2123) 2024-02-05 21:02:46 -08:00
bibtex.py Feat: Bibtex file uploads (#2398) 2024-04-02 10:51:16 -07:00
code_python.py feat(integrations): integration with Notion in the backend (#2123) 2024-02-05 21:02:46 -08:00
common.py feat(embedding): keeping citations (#2506) 2024-04-27 05:18:51 -07:00
csv.py feat(integrations): integration with Notion in the backend (#2123) 2024-02-05 21:02:46 -08:00
docx.py feat(integrations): integration with Notion in the backend (#2123) 2024-02-05 21:02:46 -08:00
epub.py feat(integrations): integration with Notion in the backend (#2123) 2024-02-05 21:02:46 -08:00
github.py feat(integrations): integration with Notion in the backend (#2123) 2024-02-05 21:02:46 -08:00
html.py feat(integrations): integration with Notion in the backend (#2123) 2024-02-05 21:02:46 -08:00
markdown.py feat(integrations): integration with Notion in the backend (#2123) 2024-02-05 21:02:46 -08:00
notebook.py feat(integrations): integration with Notion in the backend (#2123) 2024-02-05 21:02:46 -08:00
odt.py feat(integrations): integration with Notion in the backend (#2123) 2024-02-05 21:02:46 -08:00
pdf.py feat(integrations): integration with Notion in the backend (#2123) 2024-02-05 21:02:46 -08:00
powerpoint.py feat(integrations): integration with Notion in the backend (#2123) 2024-02-05 21:02:46 -08:00
telegram.py feat(integrations): integration with Notion in the backend (#2123) 2024-02-05 21:02:46 -08:00
txt.py feat(integrations): integration with Notion in the backend (#2123) 2024-02-05 21:02:46 -08:00
xlsx.py feat(integrations): integration with Notion in the backend (#2123) 2024-02-05 21:02:46 -08:00