To convey complex information effectively, organizations increasingly rely on visual documentation such as diagrams, charts, and technical illustrations. Although text documents are well integrated into modern knowledge management systems, the rich information contained in diagrams, charts, technical schematics, and other visual documentation often remains inaccessible to search and AI assistants. This leaves significant gaps in organizational knowledge bases: teams must interpret visual data manually, and automation systems can’t use critical visual information for comprehensive insights and decision-making. Amazon Q Business already handles images embedded within documents; the custom document enrichment (CDE) feature extends this capability by processing standalone image files (for example, JPGs and PNGs).
In this post, we look at a step-by-step implementation for using the CDE feature within an Amazon Q Business application. We walk you through an AWS Lambda function configured within CDE to process various image file types, and we showcase an example scenario of how this integration enhances the Amazon Q Business ability to provide comprehensive insights. By following this practical guide, you can significantly expand your organization’s searchable knowledge base, enabling more complete answers and insights that incorporate both textual and visual information sources.
Example scenario: Analyzing regional educational demographics
Consider a scenario where you’re working for a national educational consultancy that has charts, graphs, and demographic data across different AWS Regions stored in an Amazon Simple Storage Service (Amazon S3) bucket. The following image shows student distribution by age range across various cities using a bar chart. The insights in visualizations like this are valuable for decision-making but are traditionally locked within image formats in your S3 buckets and other storage.
With Amazon Q Business and CDE, we show you how to enable natural language queries against such visualizations. For example, your team could ask questions such as “Which city has the highest number of students in the 13–15 age range?” or “Compare the student demographics between City 1 and City 4” directly through the Amazon Q Business application interface.
You can bridge this gap using the Amazon Q Business CDE feature to:
- Detect and process image files during the document ingestion process
- Use Amazon Bedrock with AWS Lambda to interpret the visual information
- Extract structured data and insights from charts and graphs
- Make this information searchable using natural language queries
Solution overview
In this solution, we walk you through how to implement a CDE-based solution for your educational demographic data visualizations. The solution empowers organizations to extract meaningful information from image files using the CDE capability of Amazon Q Business. When Amazon Q Business encounters the S3 path during ingestion, CDE rules automatically trigger a Lambda function. The Lambda function identifies the image files and calls the Amazon Bedrock API, which uses multimodal large language models (LLMs) to analyze and extract contextual information from each image. The extracted text is then seamlessly integrated into the knowledge base in Amazon Q Business. End users can then quickly search for valuable data and insights from images based on their actual context. By bridging the gap between visual content and searchable text, this solution helps organizations unlock valuable insights previously hidden within their image repositories.
The following figure shows the high-level architecture diagram used for this solution.
For this use case, we use Amazon S3 as our data source. However, this same solution is adaptable to other data source types supported by Amazon Q Business, or it can be implemented with custom data sources as needed.
To complete the solution, follow these high-level implementation steps:
- Create an Amazon Q Business application and sync with an S3 bucket.
- Configure the Amazon Q Business application CDE for the Amazon S3 data source.
- Extract context from the images.
Prerequisites
The following prerequisites are needed for implementation:
- An AWS account.
- At least one Amazon Q Business Pro user that has admin permissions to set up and configure Amazon Q Business. For pricing information, refer to Amazon Q Business pricing.
- AWS Identity and Access Management (IAM) permissions to create and manage IAM roles and policies.
- A supported data source to connect, such as an S3 bucket containing your public documents.
- Access to an Amazon Bedrock LLM in the required AWS Region.
Create an Amazon Q Business application and sync with an S3 bucket
To create an Amazon Q Business application and connect it to your S3 bucket, complete the following steps. These steps provide a general overview of how to create an Amazon Q Business application and synchronize it with an S3 bucket. For more comprehensive, step-by-step guidance, follow the detailed instructions in the blog post Discover insights from Amazon S3 with Amazon Q S3 connector.
- Initiate your application setup through either the AWS Management Console or AWS Command Line Interface (AWS CLI).
- Create an index for your Amazon Q Business application.
- Use the built-in Amazon S3 connector to link your application with documents stored in your organization’s S3 buckets.
Configure the Amazon Q Business application CDE for the Amazon S3 data source
With the CDE feature of Amazon Q Business, you can make the most of your Amazon S3 data sources by using the sophisticated capabilities to modify, enhance, and filter documents during the ingestion process, ultimately making enterprise content more discoverable and valuable. When connecting Amazon Q Business to S3 repositories, you can use CDE to seamlessly transform your raw data, applying modifications that significantly improve search quality and information accessibility. This powerful functionality extends to extracting context from binary files such as images through integration with Amazon Bedrock services, enabling organizations to unlock insights from previously inaccessible content formats. By implementing CDE for Amazon S3 data sources, businesses can maximize the utility of their enterprise data within Amazon Q, creating a more comprehensive and intelligent knowledge base that responds effectively to user queries.
To configure the Amazon Q Business application CDE for the Amazon S3 data source, complete the following steps:
- Select your application and navigate to Data sources.
- Choose your existing Amazon S3 data source or create a new one. Verify that Audio/Video under Multi-media content configuration is not enabled.
- In the data source configuration, locate the Custom Document Enrichment section.
- Configure the pre-extraction rules to trigger a Lambda function when specific S3 bucket conditions are satisfied. Check the following screenshot for an example configuration.
Pre-extraction rules are executed before Amazon Q Business processes files from your S3 bucket.
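If you prefer to manage the data source through the API rather than the console, the same pre-extraction rule can be expressed as a `documentEnrichmentConfiguration` structure. The following is a minimal sketch, not the configuration from the screenshot: the helper name `build_pre_extraction_config`, the ARNs, the bucket name, and the `_source_uri` / `CONTAINS` / `.png` condition are all assumptions chosen for illustration. Check the condition against the attributes your data source actually emits.

```python
from typing import Any, Dict


def build_pre_extraction_config(
    lambda_arn: str,
    role_arn: str,
    bucket_name: str,
    condition_key: str = "_source_uri",    # assumed attribute to match on
    condition_operator: str = "CONTAINS",  # assumed operator
    condition_value: str = ".png",         # assumed value: only PNG objects
) -> Dict[str, Any]:
    """Return a documentEnrichmentConfiguration dict with one pre-extraction hook.

    The hook tells Amazon Q Business which Lambda function to invoke, which
    role to assume, which bucket the function reads and writes, and under
    what condition the function should be triggered.
    """
    return {
        "preExtractionHookConfiguration": {
            "invocationCondition": {
                "key": condition_key,
                "operator": condition_operator,
                "value": {"stringValue": condition_value},
            },
            "lambdaArn": lambda_arn,
            "roleArn": role_arn,
            "s3BucketName": bucket_name,
        }
    }


# Hypothetical identifiers, for illustration only.
config = build_pre_extraction_config(
    lambda_arn="arn:aws:lambda:us-east-1:111122223333:function:image-cde",
    role_arn="arn:aws:iam::111122223333:role/qbusiness-cde-role",
    bucket_name="amzn-s3-demo-bucket",
)
```

A dictionary like this would typically be passed as the `documentEnrichmentConfiguration` argument when creating or updating the data source with the Amazon Q Business client in Boto3. Because an invocation condition matches a single value, covering several image extensions may require a broader condition (for example, a dedicated S3 prefix) combined with the file-type check inside the Lambda function.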
Extract context from the images
To extract insights from an image file, the Lambda function makes an Amazon Bedrock API call using Anthropic’s Claude 3.7 Sonnet model. You can modify the code to use other Amazon Bedrock models based on your use case.
Constructing the prompt is a critical piece of the code. We recommend trying various prompts to get the desired output for your use case. Amazon Bedrock also offers prompt optimization, which you can use to refine a prompt for your specific use case.
Examine the following Lambda function code snippets, written in Python, to understand the Amazon Bedrock model setup along with a sample prompt to extract insights from an image.
In the following code snippet, we import the relevant Python libraries, define constants, and initialize AWS SDK for Python (Boto3) clients for Amazon S3 and the Amazon Bedrock runtime. For more information, refer to the Boto3 documentation.
The prompt passed to the Amazon Bedrock model, Anthropic’s Claude 3.7 Sonnet in this case, is broken into two parts: `prompt_prefix` and `prompt_suffix`. The prompt breakdown makes it more readable and manageable. Additionally, the Amazon Bedrock prompt caching feature can be used to reduce response latency as well as input token cost. You can modify the prompt to extract information based on your specific use case as needed.
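As an illustration of the two-part structure, the sketch below defines a prefix that sets the task and a suffix that constrains the output. The wording is an assumption written for this example, not the prompt used by the authors, and `prompt_text_blocks` is a hypothetical helper.

```python
# Illustrative wording only; tune both parts for your own images.
prompt_prefix = (
    "You are analyzing an image from an enterprise document repository. "
    "Describe what the image shows so that the description can be indexed "
    "and searched."
)

prompt_suffix = (
    "If the image is a chart or graph, state its title, axis labels, legend "
    "entries, and the approximate value of every data point. "
    "Respond in plain text and do not speculate beyond what is visible."
)


def prompt_text_blocks():
    """Return the two text blocks that wrap the image in the user message."""
    return {"text": prompt_prefix}, {"text": prompt_suffix}
```

Keeping the static instructions in the prefix and placing the image between the two parts makes it easier to change the output requirements without touching the task description.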
The `lambda_handler` is the main entry point for the Lambda function. When invoking this Lambda function, CDE passes the data source’s information in the `event` object. In this case, the S3 bucket and the S3 object key are retrieved from the `event` object along with the file format. Further processing of the input happens only if the `file_format` matches the expected file types. For production-ready code, implement proper error handling for unexpected errors.
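The handler logic can be sketched as follows, assuming the CDE event carries `s3Bucket` and `s3ObjectKey` fields and that the function responds with the key of the object Amazon Q Business should ingest. To keep the sketch testable without AWS access, the logic lives in a hypothetical `handle_event` function that receives the S3 client and the image description function as arguments. The `cde_output/` prefix is also an assumption.

```python
import logging

FILE_FORMATS = ("jpg", "jpeg", "png")  # assumed set of accepted extensions
OUTPUT_PREFIX = "cde_output/"          # hypothetical prefix for extracted text

logger = logging.getLogger()
logger.setLevel(logging.INFO)


def handle_event(event, s3_client, describe_image):
    """Process one CDE event and return the response for Amazon Q Business.

    describe_image(bucket, key, file_format) must return the extracted text.
    """
    bucket = event["s3Bucket"]
    key = event["s3ObjectKey"]
    file_format = key.lower().rsplit(".", 1)[-1]

    # By default, hand the original object back unchanged.
    result_key = key
    if file_format in FILE_FORMATS:
        try:
            description = describe_image(bucket, key, file_format)
            new_key = f"{OUTPUT_PREFIX}{key}.txt"
            s3_client.put_object(
                Bucket=bucket, Key=new_key, Body=description.encode("utf-8")
            )
            result_key = new_key
        except Exception:
            # Fall back to the original object instead of failing the sync.
            logger.exception("Image enrichment failed for %s", key)

    return {"version": "v0", "s3ObjectKey": result_key, "metadataUpdates": []}
```

In a deployed function, `lambda_handler(event, context)` would simply call `handle_event` with a real Boto3 S3 client and the function that produces the image description. Returning the original key on failure is a design choice for this sketch; you may prefer to raise the error so that the failure surfaces in the sync report.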
The `generate_image_description` function calls two other functions: one to construct the message that is passed to the Amazon Bedrock model, and one to invoke the model. It returns the final text output extracted from the image file by the model invocation.
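The two-step flow is small enough to sketch directly. In this version the two helper functions are passed in as the parameters `build_messages` and `invoke_model`, which is an assumption made so that the example is self-contained. The empty-output check is likewise an addition for illustration.

```python
from typing import Any, Callable, Dict, List

Messages = List[Dict[str, Any]]


def generate_image_description(
    bucket: str,
    key: str,
    file_format: str,
    build_messages: Callable[[str, str, str], Messages],
    invoke_model: Callable[[Messages], str],
) -> str:
    """Build the model input for one image, invoke the model, return its text."""
    messages = build_messages(bucket, key, file_format)
    description = invoke_model(messages).strip()
    if not description:
        # An empty description would index a blank document.
        raise ValueError(f"Model returned no text for s3://{bucket}/{key}")
    return description
```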
The `_llm_input` function takes in the S3 object’s details passed as input along with the file type (`png`, `jpg`) and builds the message in the format expected by the model invoked by Amazon Bedrock.
The `_invoke_model` function calls the `converse` API using the Amazon Bedrock runtime client. This API returns the response generated by the model. Within the `inferenceConfig` settings, the values for `maxTokens` and `temperature` limit the length of the response and make the responses more deterministic (less random), respectively.
Putting all the preceding code pieces together, the full Lambda function code is shown in the following block:
We strongly recommend testing and validating code in a nonproduction environment before deploying it to production. In addition to Amazon Q pricing, this solution will incur charges for AWS Lambda and Amazon Bedrock. For more information, refer to AWS Lambda pricing and Amazon Bedrock pricing.
After the Amazon S3 data is synced with the Amazon Q index, you can prompt the Amazon Q Business application to get the extracted insights as shown in the following section.
Example prompts and results
The following question and answer pairs refer to the Student Age Distribution graph at the beginning of this post.
Q: Which city has the highest number of students in the 13–15 age range?
Q: Compare the student demographics between City 1 and City 4.
In the original graph, the bars representing student counts lacked explicit numerical labels, which can make data interpretation challenging at scale. With Amazon Q Business and its integration capabilities, this limitation can be overcome. By using Amazon Q Business to process these visualizations with Amazon Bedrock LLMs through the CDE feature, we’ve enabled a more interactive and insightful analysis experience. The service extracts the contextual information embedded in the graph, even when explicit labels are absent. This means that end users can ask questions about the visualization and receive responses based on the underlying data. Rather than being limited by what’s explicitly labeled in the graph, users can explore deeper insights through natural language queries. This capability demonstrates how Amazon Q Business transforms static visualizations into queryable knowledge assets, enhancing the value of your existing data visualizations without requiring additional formatting or preparation work.
Best practices for Amazon S3 CDE configuration
When setting up CDE for your Amazon S3 data source, consider these best practices:
- Use conditional rules to only process specific file types that need transformation.
- Monitor Lambda execution with Amazon CloudWatch to track processing errors and performance.
- Set appropriate timeout values for your Lambda functions, especially when processing large files.
- Consider incremental syncing to process only new or modified documents in your S3 bucket.
- Use document attributes to track which documents have been processed by CDE.
Cleanup
Complete the following steps to clean up your resources:
- Go to the Amazon Q Business application and select Remove and unsubscribe for users and groups.
- Delete the Amazon Q Business application.
- Delete the Lambda function.
- Empty and delete the S3 bucket. For instructions, refer to Deleting a general purpose bucket.
Conclusion
This solution demonstrates how combining Amazon Q Business, custom document enrichment, and Amazon Bedrock can transform static visualizations into queryable knowledge assets, significantly enhancing the value of existing data visualizations without additional formatting work. By using these powerful AWS services together, organizations can bridge the gap between visual information and actionable insights, enabling users to interact with different file types in more intuitive ways.
Explore What is Amazon Q Business? and Getting started with Amazon Bedrock in the documentation to implement this solution for your specific use cases and unlock the potential of your visual data.
About the Authors
Amit Chaudhary is a Senior Solutions Architect at Amazon Web Services. His focus area is AI/ML, and he helps customers with generative AI, large language models, and prompt engineering. Outside of work, Amit enjoys spending time with his family.
Nikhil Jha is a Senior Technical Account Manager at Amazon Web Services. His focus areas include AI/ML, building generative AI resources, and analytics. In his spare time, he enjoys exploring the outdoors with his family.