feat(llamaparse): Update parsing instructions in common.py (#2627)

This pull request updates the parsing instructions in the `common.py`
file for the `llamaparse` feature. The previous parsing instruction for
transforming checkboxes into text has been modified to also extract
tables and transform them into key-value pairs. Additionally, the
instruction now allows for duplicate keys if needed. The example
instructions have also been updated to provide clearer examples for both
tables and checkboxes.
Stan Girard 2024-05-29 11:25:15 +02:00 committed by GitHub
parent da9a3c1897
commit a89db0cd5a
GPG Key ID: B5690EEEBB952194

@@ -45,7 +45,7 @@ async def process_file(
 parser = LlamaParse(
     result_type="markdown",  # "markdown" and "text" are available
-    parsing_instruction="Extract the tables and checkboxes. Transform tables to key = value. You can duplicates Keys if needed. For example: Productions Fonts = 300 productions Fonts Company Desktop License = Yes for Maximum of 60 Licensed Desktop users For example checkboxes should be: Premium Activated = Yes License Premier = No If a checkbox is present for a table with multiple options. Say Yes for the one activated and no for the one not activated.Format using headers.",
+    parsing_instruction="Extract the tables and transform checkboxes into text. Transform tables to key = value. You can duplicates Keys if needed. For example: Productions Fonts = 300 productions Fonts Company Desktop License = Yes for Maximum of 60 Licensed Desktop users For example checkboxes should be: Premium Activated = Yes License Premier = No If a checkbox is present for a table with multiple options. Say Yes for the one activated and no for the one not activated. Format using headers.",
     gpt4o_mode=True,
     gpt4o_api_key=os.getenv("OPENAI_API_KEY"),
 )
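
The instruction asks for tables as `key = value` lines and explicitly permits duplicate keys, so loading the parsed output into a plain dict would silently drop entries. The following is a minimal sketch of one way a consumer might read that output, assuming the parser really does emit one `Key = Value` pair per line under markdown headers. The helper names `parse_key_value_lines` and `group_by_key` are hypothetical and not part of this commit. The sample text reuses the examples from the instruction, and its repeated `Premium Activated` line is added purely for illustration.

```python
from collections import defaultdict


def parse_key_value_lines(text: str) -> list[tuple[str, str]]:
    """Collect `key = value` pairs in document order, keeping duplicate keys."""
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, markdown headers, and lines without a separator.
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may themselves contain "=".
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if key:
            pairs.append((key, value))
    return pairs


def group_by_key(pairs: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Group values under their key without discarding duplicates."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


# Illustrative input only: lines borrowed from the instruction's own examples.
sample = """# License
Productions Fonts = 300 productions Fonts
Company Desktop License = Yes for Maximum of 60 Licensed Desktop users
Premium Activated = Yes
License Premier = No
Premium Activated = No
"""

pairs = parse_key_value_lines(sample)
grouped = group_by_key(pairs)
```

Keeping the pairs as an ordered list preserves both the duplicates and their position in the document; grouping is a separate step for callers that only need the values per key.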