In the generative AI era, agents that simulate human actions and behaviors are emerging as a powerful tool for enterprises to create production-ready applications. Agents can interact with users, perform tasks, and make decisions in a way that mimics humanlike intelligence. By combining agents with foundation models (FMs) from the Amazon Titan family in Amazon Bedrock, customers can develop complex multimodal applications in which the agent understands and generates natural language or images.
For example, in the fashion retail industry, an assistant powered by agents and multimodal models can provide customers with a personalized and immersive experience. The assistant can engage in natural language conversations, understanding the customer’s preferences and intents. It can then use the multimodal capabilities to analyze images of clothing items and make recommendations based on the customer’s input. Additionally, the agent can generate visual aids, such as outfit suggestions, enhancing the overall customer experience.
In this post, we implement a fashion assistant agent using Amazon Bedrock Agents and the Amazon Titan family models. The fashion assistant provides a personalized, multimodal conversational experience. Among other capabilities, Amazon Titan Image Generator can inpaint and outpaint images, which the assistant uses to generate fashion inspirations and edit user photos. Amazon Titan Multimodal Embeddings models let the assistant search a style database using either a text prompt or a reference image provided by the user. The agent uses Anthropic Claude 3 Sonnet to orchestrate its actions, for example, looking up the current weather so that it can recommend weather-appropriate outfits. A simple web UI built with Streamlit lets the user interact with the agent.
The fashion assistant agent can be smoothly integrated into existing ecommerce platforms or mobile applications, providing customers with a seamless and delightful experience. Customers can upload their own images, describe their desired style, or even provide a reference image, and the agent will generate personalized recommendations and visual inspirations.
The code used in this solution is available in the GitHub repository.
Solution overview
The fashion assistant agent uses the power of Amazon Titan models and Amazon Bedrock Agents to provide users with a comprehensive set of style-related functionalities:
- Image-to-image or text-to-image search – This tool lets customers find catalog products similar to styles they like. We use the Titan Multimodal Embeddings model to embed each product image and store the embeddings in Amazon OpenSearch Serverless for later retrieval.
- Text-to-image generation – If the desired style is not available in the database, this tool generates unique, customized images based on the user’s query, enabling the creation of personalized styles.
- Weather API connection – By fetching weather information for a given location mentioned in the user’s prompt, the agent can suggest appropriate styles for the occasion, making sure the customer is dressed for the weather.
- Outpainting – Users can upload an image and request to change the background, allowing them to visualize their preferred styles in different settings.
- Inpainting – This tool enables users to modify specific clothing items in an uploaded image, such as changing the design or color, while keeping the background intact.
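The three image tools in the list above (text-to-image generation, outpainting, and inpainting) map onto task types of Amazon Titan Image Generator. The following is a minimal sketch of the request bodies, not the code from the repository. The helper names, the default sizes, and the model ID `amazon.titan-image-generator-v1` are assumptions made for illustration.

```python
import base64
import json


def _encode(image_bytes: bytes) -> str:
    """Titan expects input images as base64-encoded strings."""
    return base64.b64encode(image_bytes).decode("utf-8")


def text_to_image_body(prompt: str, width: int = 1024, height: int = 1024) -> str:
    """Request body for generating a new image from a text prompt."""
    return json.dumps({
        "taskType": "TEXT_IMAGE",
        "textToImageParams": {"text": prompt},
        "imageGenerationConfig": {
            "numberOfImages": 1,
            "width": width,
            "height": height,
            "cfgScale": 8.0,
        },
    })


def inpainting_body(image_bytes: bytes, mask_prompt: str, prompt: str) -> str:
    """Request body that repaints only the region matched by mask_prompt."""
    return json.dumps({
        "taskType": "INPAINTING",
        "inPaintingParams": {
            "image": _encode(image_bytes),
            "maskPrompt": mask_prompt,  # the item to change, e.g. "jacket"
            "text": prompt,             # what to paint in its place
        },
        "imageGenerationConfig": {"numberOfImages": 1, "cfgScale": 8.0},
    })


def outpainting_body(image_bytes: bytes, mask_prompt: str, prompt: str) -> str:
    """Request body that keeps the masked region and repaints everything else."""
    return json.dumps({
        "taskType": "OUTPAINTING",
        "outPaintingParams": {
            "image": _encode(image_bytes),
            "maskPrompt": mask_prompt,  # the region to keep, e.g. "person"
            "text": prompt,             # the new background
            "outPaintingMode": "DEFAULT",
        },
        "imageGenerationConfig": {"numberOfImages": 1, "cfgScale": 8.0},
    })


def generate_image(body: str, model_id: str = "amazon.titan-image-generator-v1") -> bytes:
    """Send a request body to Amazon Bedrock and return the first image as bytes."""
    import boto3  # imported lazily so the builders above work without AWS access

    client = boto3.client("bedrock-runtime")
    response = client.invoke_model(
        modelId=model_id,
        body=body,
        accept="application/json",
        contentType="application/json",
    )
    payload = json.loads(response["body"].read())
    return base64.b64decode(payload["images"][0])
```

Note the difference in how the mask is read: for inpainting the mask prompt names what to replace, while for outpainting it names what to preserve.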
The following flow chart illustrates the decision-making process:
The following diagram shows the corresponding architecture:
Prerequisites
To set up the fashion assistant agent, make sure you have the following:
- An active AWS account and an AWS Identity and Access Management (IAM) role with Amazon Bedrock, AWS Lambda, and Amazon Simple Storage Service (Amazon S3) access
- Installation of required Python libraries such as Streamlit
- Anthropic Claude 3 Sonnet, Amazon Titan Image Generator and Amazon Titan Multimodal Embeddings models enabled in Amazon Bedrock. You can confirm these are enabled on the Model access page of the Amazon Bedrock console. If these models are enabled, the access status will show as Access granted, as shown in the following screenshot.
Before executing the notebook provided in the GitHub repo to start building the infrastructure, make sure your AWS account has permission to:
- Create managed IAM roles and policies
- Create and invoke Lambda functions
- Create, read from, and write to S3 buckets
- Access and manage Amazon Bedrock agents and models
If you want to enable the image-to-image or text-to-image search capabilities, additional permissions for your AWS account are required:
- Create a security policy, access policy, collection, index, and index mapping on OpenSearch Serverless
- Call the BatchGetCollection API on OpenSearch Serverless
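To make the index and index mapping permissions concrete, the following sketch shows what a vector index definition and a k-nearest-neighbor query for the image search feature might look like. The field names `image_vector` and `image_path`, the HNSW settings, and the default of 1,024 dimensions (the default output length of Titan Multimodal Embeddings) are assumptions. The notebook in the repository may use different values.

```python
def knn_index_body(dimension: int = 1024, field: str = "image_vector") -> dict:
    """Index settings and mapping for storing one embedding per product image."""
    return {
        "settings": {"index": {"knn": True}},  # enable k-NN search on this index
        "mappings": {
            "properties": {
                field: {
                    "type": "knn_vector",
                    "dimension": dimension,  # must match the embedding length
                    "method": {
                        "name": "hnsw",
                        "engine": "faiss",
                        "space_type": "l2",
                    },
                },
                # Hypothetical metadata field pointing at the stored image
                "image_path": {"type": "keyword"},
            }
        },
    }


def knn_query_body(vector, k: int = 3, field: str = "image_vector") -> dict:
    """Query that returns the k stored images closest to the given embedding."""
    return {
        "size": k,
        "query": {"knn": {field: {"vector": list(vector), "k": k}}},
    }
```

With the opensearch-py client, these dictionaries would typically be passed as the `body` argument of `client.indices.create(...)` and `client.search(...)`.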
Set up the fashion assistant agent
To set up the fashion assistant agent, follow these steps:
- Clone the GitHub repository using the command
- Complete the prerequisites to grant sufficient permissions
- Follow the deployment steps outlined in the README.md
- (Optional) If you want to use the image_lookup feature, execute the code snippets in opensearch_ingest.ipynb to use Amazon Titan Multimodal Embeddings to embed and store sample images
- Run the Streamlit UI to interact with the agent using the command
By following these steps, you can create a powerful and engaging fashion assistant agent that combines the capabilities of Amazon Titan models with the automation and decision-making capabilities of Amazon Bedrock Agents.
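The optional ingestion step embeds sample images with Amazon Titan Multimodal Embeddings. As a rough sketch of what that involves (not the notebook's actual code), the following helpers build an embedding request from text, an image, or both, and compare two embeddings. The function names and the model ID `amazon.titan-embed-image-v1` are assumptions.

```python
import base64
import json
import math


def embedding_request_body(text=None, image_bytes=None, dimension: int = 1024) -> str:
    """Build a Titan Multimodal Embeddings request from text, an image, or both."""
    if text is None and image_bytes is None:
        raise ValueError("Provide text, an image, or both")
    body = {"embeddingConfig": {"outputEmbeddingLength": dimension}}
    if text is not None:
        body["inputText"] = text
    if image_bytes is not None:
        body["inputImage"] = base64.b64encode(image_bytes).decode("utf-8")
    return json.dumps(body)


def embed(text=None, image_bytes=None, model_id: str = "amazon.titan-embed-image-v1"):
    """Call Amazon Bedrock and return the embedding as a list of floats."""
    import boto3  # imported lazily so the pure helpers work without AWS access

    client = boto3.client("bedrock-runtime")
    response = client.invoke_model(
        modelId=model_id,
        body=embedding_request_body(text, image_bytes),
        accept="application/json",
        contentType="application/json",
    )
    return json.loads(response["body"].read())["embedding"]


def cosine_similarity(a, b) -> float:
    """Similarity between two embeddings; values near 1.0 mean similar styles."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

Because text and images are embedded into the same vector space, a text prompt and a reference photo can both be compared against the same stored image embeddings.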
Test the fashion assistant
After the fashion assistant is set up, you can interact with it through the Streamlit UI. Follow these steps:
- Navigate to your Streamlit UI, as shown in the following screenshot.
- Depending on the action you want (for example, image search, image generation, outpainting, or inpainting), upload an image or enter a text prompt describing the desired style. The following screenshot shows an example prompt.
- Press Enter to send the prompt to the agent. You can view the chain-of-thought (CoT) process of the agent in the UI, as shown in the following screenshot.
- When the response is ready, you can view the agent’s response in the UI, as shown in the following screenshot. The response may include generated images, similar style recommendations, or modified images based on your request. You can download the generated images directly from the UI or check the image in your S3 bucket.
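Behind the UI, each prompt is sent to the agent and the reply comes back as a stream of events. The following sketch shows one way to call an agent with tracing enabled and separate the answer from the trace events that carry the agent's reasoning steps. It is an illustration under assumptions, not the Streamlit app's code: the function names are hypothetical, and you would supply your own agent ID and alias ID.

```python
import uuid


def collect_agent_output(event_stream):
    """Split an invoke_agent completion stream into answer text and trace events."""
    answer_bytes = []
    traces = []
    for event in event_stream:
        if "chunk" in event:
            # Pieces of the final answer arrive as raw bytes
            answer_bytes.append(event["chunk"]["bytes"])
        elif "trace" in event:
            # Reasoning and tool-invocation steps, present when tracing is enabled
            traces.append(event["trace"])
    return b"".join(answer_bytes).decode("utf-8"), traces


def ask_agent(prompt: str, agent_id: str, agent_alias_id: str, session_id=None):
    """Send one prompt to an Amazon Bedrock agent and return (answer, traces)."""
    import boto3  # imported lazily so collect_agent_output works without AWS access

    client = boto3.client("bedrock-agent-runtime")
    response = client.invoke_agent(
        agentId=agent_id,
        agentAliasId=agent_alias_id,
        sessionId=session_id or str(uuid.uuid4()),  # reuse to keep conversation context
        inputText=prompt,
        enableTrace=True,
    )
    return collect_agent_output(response["completion"])
```

Passing the same session ID on later calls lets the agent keep the context of the conversation, which matters for follow-up requests such as editing an image it just generated.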
Clean up
To avoid unnecessary costs, make sure to delete the resources used in this solution. You can do this by running the following command.
Conclusion
The fashion assistant agent, powered by Amazon Titan models and Amazon Bedrock Agents, is an example of how retailers can create innovative applications that enhance the customer experience and drive business growth. By using this solution, retailers can gain a competitive edge, offering personalized style recommendations, visual inspirations, and interactive fashion advice to their customers.
We encourage you to explore the potential of building more agents like this fashion assistant by checking out the examples available on the aws-samples GitHub repository.