feat: Improve file loading logic in File model (#2861)

The changes in `files.py` improve the file loading logic in the File
model. The result of `loader.load()` is now stored in `loaded_content`
and wrapped in a list when it is not already one, before being assigned
to the `documents` variable. Additionally, the logger now includes
information about the loaded documents.
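
The handling of the `load()` result described above can be sketched in isolation roughly as follows. This is a minimal illustration, not the code in `files.py`: the helper name `as_document_list` and the two stand-in loader classes are hypothetical and exist only to show both return shapes.

```python
from typing import Any, List


def as_document_list(loaded_content: Any) -> List[Any]:
    """Return loaded_content unchanged if it is a list, else wrap it in one."""
    # Hypothetical helper: mirrors the isinstance check from the diff.
    return loaded_content if isinstance(loaded_content, list) else [loaded_content]


class SingleDocLoader:
    """Hypothetical loader whose load() returns a single object."""

    def load(self) -> str:
        return "one document"


class MultiDocLoader:
    """Hypothetical loader whose load() returns a list."""

    def load(self) -> List[str]:
        return ["doc a", "doc b"]


single = as_document_list(SingleDocLoader().load())
multi = as_document_list(MultiDocLoader().load())
```

With the previous `documents.extend(loader.load())`, a string result would have been split into individual characters and a non-iterable result would have raised a `TypeError`. Note that the `isinstance(..., list)` check only recognizes lists, so a tuple or generator would be wrapped as a single element.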

Co-authored-by: Stan Girard <stan@quivr.app>
Stan Girard 2024-07-15 12:10:14 +02:00 committed by GitHub
parent afe17a7163
commit 37793d549a

@@ -42,8 +42,10 @@ class File(BaseModel):
         """
         logger.info(f"Computing documents from file {self.file_name}")
         loader = loader_class(self.tmp_file_path)
-        documents = []
-        documents.extend(loader.load())
+        loaded_content = loader.load()
+        documents = (
+            [loaded_content] if not isinstance(loaded_content, list) else loaded_content
+        )
         text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
             chunk_size=self.chunk_size, chunk_overlap=self.chunk_overlap