Query group
Extracts individual facts in a document, such as the date of an invoice, the liability limit of an insurance policy, or the destination address of a shipping container delivery. When you configure the Multimodal Engine parameter, this method can extract data from non-text images, such as photographs, charts, or illustrations.
Sensible uses a large language model (LLM) to find data in paragraphs of free text, in images, or in more structured layouts, for example key/value pairs or tables. Create a query group to extract multiple facts that share a context, or are co-located in the document.
For tips and troubleshooting, see Query Group extraction tips.
For more information about how this method works, see Notes.
Parameters
Note: For the full list of parameters available for this method, see Global parameters for methods. The following table only shows parameters most relevant to or specific to this method.
Note: You can configure some of the following parameters in both the NLP preprocessor and in a field’s method. If you configure both, the field’s parameter overrides the NLP preprocessor’s parameter. For more information, see Advanced prompt configuration.
Parameters
key | value | description |
---|---|---|
method (required) | object | For this object’s parameters, see the following table. |
anchor | | The Anchor parameter is optional for fields that use this method. If you specify an anchor and leave the Multimodal Engine parameter unconfigured, or configured with "region": "automatic", then: - Sensible ignores the anchor if it’s present in the document. - Sensible returns nulls for the fields in this query group if the anchor isn’t present in the document. If you specify an anchor and configure the Multimodal Engine parameter’s region manually, then Sensible creates the prompt’s context relative to the anchor. |
Query group parameters
key | value | description |
---|---|---|
id (required) | queryGroup | |
queries | array of objects | An array of query objects, where each object extracts a single fact and outputs a single field. Each query contains the following parameters: - id (required): The ID for the extracted field. - description (required): A free-text question about information in the document. For example, “what’s the policy period?” or “what’s the client’s first and last name?”. For more information about how to write questions (or “prompts”), see Query Group extraction tips. For a full config example, see the sketch after this table. |
chunkScoringText | string | Configures the context’s content. For details about context and chunks, see the Notes section. Set this parameter to a representative snippet of text from the part of the document where you expect to find the answer to your prompt. Use this parameter to narrow down the page location of the answer to your prompt. For example, if your prompt has multiple candidate answers, and the correct answer is located near unique or distinctive text that’s difficult to incorporate into your question, then specify the distinctive text in this parameter. If specified, Sensible uses this text to find top-scoring chunks. If unspecified, Sensible uses the prompt to score chunks. Sensible recommends that the snippet is specific to the target chunk, semantically similar to the chunk, and structurally similar to the chunk. For example, if the chunk contains a street address formatted with newlines, then provide a snippet with an example street address that contains newlines, like 123 Main Street\nLondon, England. If the chunk contains a street address in a free-text paragraph, then provide an unformatted street address in the snippet. |
multimodalEngine | object | Configure this parameter to: - Extract data from images embedded in a document, for example, photos, charts, or illustrations. - Troubleshoot extracting from complex text layouts, such as overlapping lines, lines between lines, and handwriting. For example, use this as an alternative to the Signature method, the Nearest Checkbox method, the OCR engine, and line preprocessors. This parameter sends an image of the document region containing the target data to a multimodal LLM (GPT-4 Vision Preview), so that you can ask questions about text and non-text images. This bypasses Sensible’s OCR and direct-text extraction processes for the region. Note that this option doesn’t support confidence signals. This parameter has the following parameters: - region: The document region to send as an image to the multimodal LLM. Configurable with the following options: - To automatically select the context as the region, specify "region": "automatic". If you configure this option for a non-text image, then help Sensible locate the context by including queries in the group that target text near the image, or by specifying the nearby text in the Chunk Scoring Text parameter. - To manually specify a region relative to the field’s anchor, specify the region using the Region method’s parameters, for example: "region": { "start": "below", "width": 8, "height": 1.2, "offsetX": -2.5, "offsetY": -0.25 } |
confidenceSignals | | For information about this parameter, see Advanced prompt configuration. |
contextDescription | | For information about this parameter, see Advanced prompt configuration. |
pageHinting | | For information about this parameter, see Advanced prompt configuration. |
chunkCount | default: 5 | For information about this parameter, see Advanced prompt configuration. |
chunkSize | default: 0.5 | For information about this parameter, see Advanced prompt configuration. |
chunkOverlapPercentage | default: 0.5 | For information about this parameter, see Advanced prompt configuration. |
pageRange | | For information about this parameter, see Advanced prompt configuration. |
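To show how these parameters fit together, the following is a minimal sketch of a query group config. All values are illustrative assumptions: the anchor text, field IDs, prompts, and scoring snippet are invented for this example, and the automatic region setting is the one documented in the Multimodal Engine row above.

```json
{
  "fields": [
    {
      "anchor": "policyholder information",
      "method": {
        "id": "queryGroup",
        "queries": [
          {
            "id": "insured_address",
            "description": "what is the insured's mailing address?"
          },
          {
            "id": "policy_period",
            "description": "what's the policy period?"
          }
        ],
        "chunkScoringText": "123 Main Street\nLondon, England",
        "multimodalEngine": {
          "region": "automatic"
        }
      }
    }
  ]
}
```

Each query outputs its own field in the extraction results, keyed by its id.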
Examples
Example: Extract from images
Config
The following example shows extracting structured data from real estate photographs embedded in an offering memorandum document using the Multimodal Engine parameter. It also shows extracting data from text.
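The original config isn’t reproduced here. The following is a hedged sketch of what such a config might look like, with invented field IDs and prompts; the Multimodal Engine’s automatic region selection is the documented part.

```json
{
  "fields": [
    {
      "method": {
        "id": "queryGroup",
        "queries": [
          {
            "id": "building_type",
            "description": "what type of building is shown in the photograph?"
          },
          {
            "id": "number_of_stories",
            "description": "how many stories does the pictured building have?"
          },
          {
            "id": "property_name",
            "description": "what is the name of the property?"
          }
        ],
        "multimodalEngine": {
          "region": "automatic"
        }
      }
    }
  ]
}
```

Note that the last query targets text near the photographs, which, per the parameters table, helps Sensible locate the context for a non-text image.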
Example document
The following image shows the example document used with this example config:
Example document (download link).
Output
Example: Extract handwriting
The following example shows using a multimodal LLM to extract from a scanned document containing handwriting. For an alternate approach to extracting from this document, see also the Sort Lines example.
Config
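The original config isn’t reproduced here. The following is a minimal sketch, assuming a scanned form with handwritten entries; the field IDs and prompts are invented for this example.

```json
{
  "fields": [
    {
      "method": {
        "id": "queryGroup",
        "queries": [
          {
            "id": "applicant_name",
            "description": "what is the applicant's handwritten name?"
          },
          {
            "id": "date_signed",
            "description": "on what date was the form signed?"
          }
        ],
        "multimodalEngine": {
          "region": "automatic"
        }
      }
    }
  ]
}
```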
Example document
The following image shows the example document used with this example config:
Example document (download link).
Output
Example: Extract from lease
The following example shows using the Query Group method to extract information from a lease.
Config
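The original config isn’t reproduced here. The following is a hedged sketch of what a lease extraction config might look like; the field IDs and prompts are invented for this example.

```json
{
  "fields": [
    {
      "method": {
        "id": "queryGroup",
        "queries": [
          {
            "id": "tenant_name",
            "description": "what is the tenant's full name?"
          },
          {
            "id": "monthly_rent",
            "description": "what is the monthly rent amount?"
          },
          {
            "id": "lease_start_date",
            "description": "on what date does the lease term begin?"
          }
        ]
      }
    }
  ]
}
```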
Example document
The following image shows the example document used with this example config:
Example document (download link).
Output
Notes
For an overview of how this method works, see the following steps:
- To meet the LLM’s token limit for input, Sensible splits the document into equal-sized, overlapping chunks.
- Sensible scores each chunk by its similarity to either the concatenated Description parameters for the queries in the group, or to the chunkScoringText parameter if specified. Sensible scores each chunk using the OpenAI Embeddings API.
- Sensible selects a number of the top-scoring chunks and combines them into the “context”. The chunks can be non-consecutive in the document. Sensible deduplicates overlapping text in consecutive chunks. If you set chunk-related parameters that cause the context to exceed the LLM’s token limit, Sensible automatically reduces the chunk count until the context meets the token limit.
- Sensible creates a full prompt for the LLM (GPT-3.5 Turbo) that includes the chunks, page hinting data, and your Description parameters. For more information about the full prompt, see Advanced prompt configuration.
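The chunk-related parameters in the parameters table control the splitting, scoring, and selection steps above. As a hedged sketch (the values, query ID, and description are illustrative assumptions, not recommendations), a config that scores fewer, larger chunks might look like this:

```json
{
  "fields": [
    {
      "method": {
        "id": "queryGroup",
        "queries": [
          {
            "id": "claim_number",
            "description": "what is the claim number?"
          }
        ],
        "chunkCount": 3,
        "chunkSize": 1,
        "chunkOverlapPercentage": 0.25
      }
    }
  ]
}
```

Larger chunks give each candidate location more surrounding text, at the cost of fewer candidate locations in the context.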
How location highlighting works
In the Sensible Instruct editor, you can click the search icon to the right of the output of a query field to view its source text in the document.
For an overview of how Sensible finds the source text in the document for the LLM’s response, see the following steps:
- The LLM returns a response to your prompt.
- Sensible searches in the source document for a line that’s a fuzzy match to the response. For example, if the LLM returns 4387-09-22-33, Sensible matches the line Policy Number: 4387-09-22-33 in the document. Sensible implements fuzzy matching using Levenshtein distance.
- Sensible selects the three lines in the document that contain the best fuzzy matches. For each line, Sensible concatenates the preceding and succeeding lines, in case the match spans multiple lines.
- Sensible searches for a fuzzy match in the concatenated lines for the text that the LLM returned. Sensible returns the best match.
- Sensible highlights the best match in the document in the Sensible Instruct editor or in the SenseML editor.
Limitations
Sensible can highlight an incorrect location under the following circumstances:
- If you prompt the LLM to reformat the source text in the document, or reformat the text using a type, then Sensible can fail to find a match or can find an inaccurate match.
- If there are multiple candidate fuzzy matches in the document (for example, two instances of April 7), Sensible chooses the top-scoring match. If candidates have similar scores, Sensible uses page location as a tiebreaker and chooses the earliest match in the document.
- If the LLM returns text that’s not in the document, then location highlighting is inapplicable.