Reinforcement Learning from Human Feedback (RLHF) is recognized as the industry-standard technique for ensuring large language models (LLMs) produce content that is truthful, harmless, and helpful. The technique operates by training a “reward model” based on human feedback, then using this model as a reward function to optimize an agent’s policy through reinforcement learning (RL). RLHF has proven essential to producing LLMs such as OpenAI’s ChatGPT and Anthropic’s Claude that are aligned with human objectives. Gone are the days when you needed unnatural prompt engineering to get base models, such as GPT-3, to solve your tasks.
An important caveat of RLHF is that it is a complex and often unstable procedure. As a method, RLHF requires that you first train a reward model that reflects human preferences. Then, the LLM must be fine-tuned to maximize the reward model’s estimated reward without drifting too far from the original model.
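At a high level, that constraint is often implemented by subtracting a KL-style penalty, measuring drift from the original (reference) model, from the reward model’s score. Here is a minimal sketch in PyTorch; the function name, tensor shapes, and coefficient are illustrative assumptions, not details from this post:

```python
import torch

def rlhf_objective(reward_score: torch.Tensor,
                   policy_logprobs: torch.Tensor,
                   ref_logprobs: torch.Tensor,
                   kl_coef: float = 0.1) -> torch.Tensor:
    # Approximate the per-sequence KL divergence as the summed difference
    # of token log-probabilities between the policy and the reference model
    kl_penalty = (policy_logprobs - ref_logprobs).sum(dim=-1)
    # Maximize reward while penalizing drift from the original model
    return reward_score - kl_coef * kl_penalty
```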
How United Airlines built a cost-efficient Optical Character Recognition active learning pipeline
In this post, we discuss how United Airlines, in collaboration with the Amazon Machine Learning Solutions Lab, built an active learning framework on AWS to automate the processing of passenger documents.
“In order to deliver the best flying experience for our passengers and make our internal business processes as efficient as possible, we have developed an automated machine learning-based document processing pipeline in AWS. In order to power these applications, as well as those using other data modalities like computer vision, we need a robust and efficient workflow to quickly annotate data, train and evaluate models, and iterate quickly. Over the course of a couple of months, United partnered with the Amazon Machine Learning Solutions Lab to design and develop a reusable, use case-agnostic active learning workflow using AWS CDK. This workflow will be foundational to our unstructured data-based machine learning applications as it will enable us to minimize human labeling
Optimize generative AI workloads for environmental sustainability
The adoption of generative AI is rapidly expanding, reaching an ever-growing number of industries and users worldwide. With the increasing complexity and scale of generative AI models, it is crucial to work towards minimizing their environmental impact. This involves a continuous effort focused on energy reduction and efficiency by achieving the maximum benefit from the resources provisioned and minimizing the total resources required.
To add to our guidance for optimizing deep learning workloads for sustainability on AWS, this post provides recommendations that are specific to generative AI workloads. In particular, we provide practical best practices for different customization scenarios, including training models from scratch, fine-tuning with additional data using full or parameter-efficient techniques, Retrieval Augmented Generation (RAG), and prompt engineering. Although this post primarily focuses on large language models (LLMs), we believe most of the recommendations can be extended to other foundation models.
Generative AI problem framing
When framing your
Train and deploy ML models in a multicloud environment using Amazon SageMaker
As customers accelerate their migrations to the cloud and transform their business, some find themselves in situations where they have to manage IT operations in a multicloud environment. For example, you might have acquired a company that was already running on a different cloud provider, or you may have a workload that generates value from unique capabilities provided by AWS. Another example is independent software vendors (ISVs) that make their products and services available in different cloud platforms to benefit their end customers. Or an organization may be operating in a Region where its primary cloud provider is not available; to meet data sovereignty or data residency requirements, it can use a secondary cloud provider.
In these scenarios, as you start to embrace generative AI, large language models (LLMs) and machine learning (ML) technologies as a core part of your business, you may be looking for
Generative AI and multi-modal agents in AWS: The key to unlocking new value in financial markets
Multi-modal data is a valuable component of the financial industry, encompassing market, economic, customer, news and social media, and risk data. Financial organizations generate, collect, and use this data to gain insights into financial operations, make better decisions, and improve performance. However, there are challenges associated with multi-modal data due to the complexity and lack of standardization in financial systems and data formats and quality, as well as the fragmented and unstructured nature of the data. Financial clients have frequently described the operational overhead of gaining financial insights from multi-modal data, which necessitates complex extraction and transformation logic, leading to bloated effort and costs. Technical challenges with multi-modal data further include the complexity of integrating and modeling different data types, the difficulty of combining data from multiple modalities (text, images, audio, video), and the need for advanced computer science skills and sophisticated analysis tools.
One of the ways to handle
How VirtuSwap accelerates their pandas-based trading simulations with an Amazon SageMaker Studio custom container and AWS GPU instances
This post is written in collaboration with Dima Zadorozhny and Fuad Babaev from VirtuSwap.
VirtuSwap is a startup company developing innovative technology for decentralized exchange of assets on blockchains. VirtuSwap’s technology provides more efficient trading for assets that don’t have a direct pair between them. The absence of a direct pair leads to costly indirect trading, meaning that two or more trades are required to complete a desired swap, leading to double or triple trading costs. VirtuSwap’s Reserve-based Virtual Pools technology solves the problem by making every trade direct, saving up to 50% of trading costs. Read more at virtuswap.io.
In this post, we share how VirtuSwap used the bring-your-own-container feature in Amazon SageMaker Studio to build a robust environment to host their GPU-intensive simulations to solve linear optimization problems.
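To give a feel for what bring-your-own-container involves (all names and ARNs below are placeholders, not VirtuSwap’s actual configuration), a custom image pushed to Amazon ECR is registered with SageMaker and described for Studio roughly like this:

```python
import boto3

sm = boto3.client("sagemaker")

# Register a custom Studio image backed by a container in Amazon ECR
sm.create_image(
    ImageName="gpu-sim-image",
    RoleArn="arn:aws:iam::111122223333:role/StudioExecutionRole",
)
sm.create_image_version(
    ImageName="gpu-sim-image",
    BaseImage="111122223333.dkr.ecr.us-east-1.amazonaws.com/gpu-sim:latest",
)

# Describe how the image should run as a Studio kernel
sm.create_app_image_config(
    AppImageConfigName="gpu-sim-config",
    KernelGatewayImageConfig={
        "KernelSpecs": [{"Name": "python3", "DisplayName": "GPU Simulations"}]
    },
)
```

The image then has to be attached to the SageMaker domain before it appears as a selectable environment in Studio.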
The challenge
The VirtuSwap Minerva engine creates recommendations for optimal distribution of liquidity between different liquidity pools, while taking into account
Unlock ML insights using the Amazon SageMaker Feature Store Feature Processor
Amazon SageMaker Feature Store provides an end-to-end solution to automate feature engineering for machine learning (ML). For many ML use cases, raw data like log files, sensor readings, or transaction records needs to be transformed into meaningful features that are optimized for model training.
Feature quality is critical to ensure a highly accurate ML model. Transforming raw data into features using aggregation, encoding, normalization, and other operations is often needed and can require significant effort. Engineers must manually write custom data preprocessing and aggregation logic in Python or Spark for each use case.
This undifferentiated heavy lifting is cumbersome, repetitive, and error-prone. The SageMaker Feature Store Feature Processor reduces this burden by automatically transforming raw data into aggregated features suitable for batch training of ML models. It lets engineers provide simple data transformation functions, then handles running them at scale on Spark and managing the underlying infrastructure. This enables data scientists
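As a loose sketch of the programming model (the S3 path, feature group ARN, and aggregation logic are invented for illustration), a Feature Processor is declared in the SageMaker Python SDK roughly like this:

```python
from sagemaker.feature_store.feature_processor import CSVDataSource, feature_processor

# Placeholder locations: raw CSV data in S3, aggregated output in a feature group
RAW_DATA = CSVDataSource("s3://my-bucket/raw-transactions/")
FEATURE_GROUP_ARN = (
    "arn:aws:sagemaker:us-east-1:111122223333:feature-group/transactions-agg"
)

@feature_processor(inputs=[RAW_DATA], output=FEATURE_GROUP_ARN)
def aggregate(raw_df):
    # Spark DataFrame in, Spark DataFrame out: total spend per customer
    return raw_df.groupBy("customer_id").sum("amount")

aggregate()  # runs the transformation and ingests the results into the feature group
```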
Orchestrate Ray-based machine learning workflows using Amazon SageMaker
Machine learning (ML) is becoming increasingly complex as customers try to solve more and more challenging problems. This complexity often leads to the need for distributed ML, where multiple machines are used to train a single model. Although this enables parallelization of tasks across multiple nodes, leading to accelerated training times, enhanced scalability, and improved performance, there are significant challenges in effectively using distributed hardware. Data scientists have to address challenges like data partitioning, load balancing, fault tolerance, and scalability. ML engineers must handle parallelization, scheduling, faults, and retries manually, requiring complex infrastructure code.
In this post, we discuss the benefits of using Ray and Amazon SageMaker for distributed ML, and provide a step-by-step guide on how to use these frameworks to build and deploy a scalable ML workflow.
Ray, an open-source distributed computing framework, provides a flexible foundation for distributed training and serving of ML models. It abstracts away
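To give a flavor of Ray’s programming model (this toy example is ours, not from the post), a function decorated with @ray.remote becomes a task that Ray schedules across available workers:

```python
import ray

ray.init()  # connects to an existing cluster, or starts a local one

@ray.remote
def train_on_shard(shard_id: int) -> int:
    # Stand-in for per-shard training work
    return shard_id * shard_id

# Launch four tasks in parallel and gather their results
futures = [train_on_shard.remote(i) for i in range(4)]
print(ray.get(futures))  # [0, 1, 4, 9]
```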
Designing resilient cities at Arup using Amazon SageMaker geospatial capabilities
This post is co-authored with Richard Alexander and Mark Hallows from Arup.
Arup is a global collective of designers, consultants, and experts dedicated to sustainable development. Data underpins Arup’s consultancy for clients, with world-class collection and analysis providing the insight needed to make an impact.
The solution presented here helps direct decision-making processes for resilient city design. Informing design decisions towards more sustainable choices reduces the overall urban heat island (UHI) effect and improves quality-of-life metrics for air quality, water quality, urban acoustics, biodiversity, and thermal comfort. Identifying key areas within an urban environment for intervention allows Arup to provide the best guidance in the industry and create better quality of life for citizens around the planet.
Urban heat islands describe the effect urban areas have on temperature compared to surrounding rural environments. Understanding how UHI affects our cities leads to improved designs that reduce the impact of urban
Learn how to build and deploy tool-using LLM agents using AWS SageMaker JumpStart Foundation Models
Large language model (LLM) agents are programs that extend the capabilities of standalone LLMs with 1) access to external tools (APIs, functions, webhooks, plugins, and so on), and 2) the ability to plan and execute tasks in a self-directed fashion. Often, LLMs need to interact with other software, databases, or APIs to accomplish complex tasks. For example, an administrative chatbot that schedules meetings would require access to employees’ calendars and email. With access to tools, LLM agents can become more powerful—at the cost of additional complexity.
In this post, we introduce LLM agents and demonstrate how to build and deploy an e-commerce LLM agent using Amazon SageMaker JumpStart and AWS Lambda. The agent will use tools to provide new capabilities, such as answering questions about returns (“Is my return rtn001 processed?”) and providing updates about orders (“Could you tell me if order 123456 has shipped?”). These new capabilities require LLMs
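Conceptually, the agent alternates between asking the LLM what to do next and executing the tool it names. The loop below is our own simplification, with canned tools and a stand-in for the model call, not the post’s implementation:

```python
def llm(prompt: str) -> dict:
    """Stand-in for a call to a SageMaker JumpStart endpoint. A real
    implementation would parse the model's text into an action dict."""
    if "Observation:" in prompt:
        return {"tool": "finish", "arg": prompt.split("Observation: ")[-1]}
    return {"tool": "get_order_status", "arg": "123456"}

# Hypothetical tools the agent can invoke
TOOLS = {
    "get_order_status": lambda order_id: f"Order {order_id} has shipped.",
    "get_return_status": lambda return_id: f"Return {return_id} is processed.",
}

def run_agent(question: str, max_steps: int = 3) -> str:
    context = question
    for _ in range(max_steps):
        action = llm(context)
        if action["tool"] == "finish":
            return action["arg"]
        # Execute the chosen tool and feed the result back to the model
        context += "\nObservation: " + TOOLS[action["tool"]](action["arg"])
    return context

print(run_agent("Could you tell me if order 123456 has shipped?"))
```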
Build a classification pipeline with Amazon Comprehend custom classification (Part I)
“Data locked away in text, audio, social media, and other unstructured sources can be a competitive advantage for firms that figure out how to use it“
Only 18% of organizations in a 2019 survey by Deloitte reported being able to take advantage of unstructured data. Yet the majority of data, between 80% and 90%, is unstructured. That is a big untapped resource with the potential to give businesses a competitive edge if they can figure out how to use it. It can be difficult to find insights from this data, particularly if efforts are needed to classify, tag, or label it. Amazon Comprehend custom classification can be useful in this situation. Amazon Comprehend is a natural-language processing (NLP) service that uses machine learning to uncover valuable insights and connections in text.
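Once a custom classifier has been trained and deployed, classifying text is a single API call. A minimal sketch with boto3 follows; the endpoint ARN and sample text are placeholders:

```python
import boto3

comprehend = boto3.client("comprehend")

# Placeholder ARN for a deployed custom-classification endpoint
ENDPOINT_ARN = (
    "arn:aws:comprehend:us-east-1:111122223333:document-classifier-endpoint/demo"
)

response = comprehend.classify_document(
    Text="Please update the shipping address on my order.",
    EndpointArn=ENDPOINT_ARN,
)
# Each predicted class comes back with a confidence score
for label in response["Classes"]:
    print(label["Name"], label["Score"])
```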
Document categorization or classification has significant benefits across business domains:
Improved search and retrieval –
Fine-tune Falcon 7B and other LLMs on Amazon SageMaker with @remote decorator
Today, generative AI models cover a variety of tasks, from text summarization and Q&A to image and video generation. To improve the quality of output, approaches like n-shot learning, prompt engineering, Retrieval Augmented Generation (RAG), and fine-tuning are used. Fine-tuning allows you to adjust these generative AI models to achieve improved performance on your domain-specific tasks.
With Amazon SageMaker, now you can run a SageMaker training job simply by annotating your Python code with the @remote decorator. The SageMaker Python SDK automatically translates your existing workspace environment, and any associated data processing code and datasets, into a SageMaker training job that runs on the training platform. This lets you write your code in a natural, object-oriented way while still using SageMaker capabilities to run training jobs on a remote cluster with minimal changes.
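As a brief sketch (the instance type and function body are illustrative, not this post’s full example), the decorator turns an ordinary Python function into a SageMaker training job:

```python
from sagemaker.remote_function import remote

@remote(instance_type="ml.g5.12xlarge")
def fine_tune(model_id: str, epochs: int = 1) -> str:
    # This body executes remotely on the requested instance; in practice
    # it would hold the model loading and training loop.
    return f"fine-tuned {model_id} for {epochs} epoch(s)"

# Calling the function launches the training job and returns its result
print(fine_tune("tiiuae/falcon-7b"))
```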
In this post, we showcase how to fine-tune the Falcon-7B Foundation Model (FM) using @remote
Simplify access to internal information using Retrieval Augmented Generation and LangChain Agents
This post takes you through the most common challenges that customers face when searching internal documents, and gives you concrete guidance on how AWS services can be used to create a generative AI conversational bot that makes internal information more useful.
Unstructured data accounts for 80% of all the data found within organizations, consisting of repositories of manuals, PDFs, FAQs, emails, and other documents that grow daily. Businesses today rely on continuously growing repositories of internal information, and problems arise when the amount of unstructured data becomes unmanageable. Often, users find themselves reading and checking many different internal sources to find the answers they need.
Internal question and answer forums can help users get highly specific answers but also require longer wait times. In the case of company-specific internal FAQs, long wait times result in lower employee productivity. Question and answer forums are difficult to scale as they rely on
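At its core, Retrieval Augmented Generation retrieves the most relevant internal passages and prepends them to the user’s question before calling the model. The toy retriever below (word overlap over a hardcoded document list) is our own illustration; a production system would use vector embeddings and a vector store:

```python
DOCS = [
    "Expense reports must be filed within 30 days of travel.",
    "The VPN client can be downloaded from the IT self-service portal.",
    "New hires receive their laptop on the first day of onboarding.",
]

def retrieve(question: str, top_k: int = 2) -> list:
    # Toy retriever: rank documents by word overlap with the question
    q_words = set(question.lower().split())
    ranked = sorted(DOCS, key=lambda d: -len(q_words & set(d.lower().split())))
    return ranked[:top_k]

def build_prompt(question: str) -> str:
    context = "\n".join(retrieve(question))
    return (
        "Answer using only the context below.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )

print(build_prompt("How do I download the VPN client?"))
```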
Visualize an Amazon Comprehend analysis with a word cloud in Amazon QuickSight
Searching for insights in a repository of free-form text documents can be like finding a needle in a haystack. A traditional approach might be to use word counting or other basic analysis to parse documents, but with the power of Amazon AI and machine learning (ML) tools, we can gain a deeper understanding of the content.
Amazon Comprehend is a fully managed service that uses natural language processing (NLP) to extract insights about the content of documents. Amazon Comprehend develops insights by recognizing the entities, key phrases, sentiment, themes, and custom elements in a document. Amazon Comprehend can create new insights based on understanding the document structure and entity relationships. For example, with Amazon Comprehend, you can scan an entire document repository for key phrases.
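For instance, extracting key phrases takes a single call with the standard boto3 client (the sample text here is ours):

```python
import boto3

comprehend = boto3.client("comprehend")

text = "The flight was delayed, but the gate agents rebooked everyone quickly."
response = comprehend.detect_key_phrases(Text=text, LanguageCode="en")

# Each key phrase comes back with a confidence score and character offsets
for phrase in response["KeyPhrases"]:
    print(phrase["Text"], round(phrase["Score"], 3))
```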
Amazon Comprehend lets non-ML experts easily do tasks that normally take hours. Amazon Comprehend eliminates much of the time needed to clean, build, and
Amazon SageMaker simplifies the Amazon SageMaker Studio setup for individual users
Today, we are excited to announce the simplified Quick setup experience in Amazon SageMaker. With this new capability, individual users can launch Amazon SageMaker Studio with default presets in minutes.
SageMaker Studio is an integrated development environment (IDE) for machine learning (ML). ML practitioners can perform all ML development steps—from preparing their data to building, training, and deploying ML models—within a single, integrated visual interface. You also get access to a large collection of models and pre-built solutions that you can deploy with a few clicks.
To use SageMaker Studio or other personal apps such as Amazon SageMaker Canvas, or to collaborate in shared spaces, AWS customers need to first set up a SageMaker domain. A SageMaker domain consists of an associated Amazon Elastic File System (Amazon EFS) volume, a list of authorized users, and a variety of security, application, policy, and Amazon Virtual Private Cloud (Amazon VPC) configurations. When
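For reference, the pieces Quick setup configures on your behalf resemble a programmatic domain creation like the boto3 sketch below, where the role, subnet, and VPC identifiers are placeholders:

```python
import boto3

sm = boto3.client("sagemaker")

# Create a domain with minimal settings; Quick setup picks sensible defaults
response = sm.create_domain(
    DomainName="my-studio-domain",
    AuthMode="IAM",
    DefaultUserSettings={
        "ExecutionRole": "arn:aws:iam::111122223333:role/SageMakerExecutionRole"
    },
    SubnetIds=["subnet-0abc1234"],
    VpcId="vpc-0abc1234",
)
print(response["DomainArn"])
```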
Unlocking language barriers: Translate application logs with Amazon Translate for seamless support
Application logs are an essential piece of information that provides crucial insights into the inner workings of an application. This includes valuable information such as events, errors, and user interactions that help an application developer or an operations support engineer debug and provide support. However, when these logs are presented in languages other than English, it creates a significant hurdle for developers who can’t read the content, and hinders the support team’s ability to identify and address issues promptly.
In this post, we explore a solution for overcoming language barriers using Amazon Translate, a fully managed neural machine translation service for translating text to and from English across a wide range of supported languages. The solution will complement your existing logging workflows by automatically translating all your application logs in Amazon CloudWatch in real time, which can alleviate the challenges posed by non-English application logs.
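The heart of such a solution is a single Amazon Translate call, sketched below with boto3 (the sample log line is ours); setting the source language to auto lets the service detect it:

```python
import boto3

translate = boto3.client("translate")

log_line = "Fehler: Verbindung zur Datenbank fehlgeschlagen."
response = translate.translate_text(
    Text=log_line,
    SourceLanguageCode="auto",  # let Amazon Translate detect the source language
    TargetLanguageCode="en",
)
print(response["TranslatedText"])  # e.g. "Error: Connection to the database failed."
```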
Accelerate client success management through email classification with Hugging Face on Amazon SageMaker
This is a guest post from Scalable Capital, a leading FinTech in Europe that offers digital wealth management and a brokerage platform with a trading flat rate.
As a fast-growing company, Scalable Capital’s goals are to not only build an innovative, robust, and reliable infrastructure, but to also provide the best experiences for our clients, especially when it comes to client services.
Scalable receives hundreds of email inquiries from our clients on a daily basis. By implementing a modern natural language processing (NLP) model, the response process has been made much more efficient, and waiting time for clients has been reduced tremendously. The machine learning (ML) model classifies new incoming customer requests as soon as they arrive and redirects them to predefined queues, which allows our dedicated client success agents to focus on the contents of the emails according to their skills and provide appropriate responses.
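In spirit, the classification step looks like the Hugging Face pipeline sketch below; the model identifier and labels are hypothetical, not Scalable Capital’s actual model:

```python
from transformers import pipeline

# Hypothetical fine-tuned email classifier
classifier = pipeline("text-classification", model="my-org/email-router-bert")

email = "Hi, I would like to change the reference account for my deposits."
prediction = classifier(email)[0]
# The predicted label maps to a predefined support queue
print(prediction["label"], prediction["score"])
```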
In this post, we
Falcon 180B foundation model from TII is now available via Amazon SageMaker JumpStart
Today, we are excited to announce that the Falcon 180B foundation model developed by Technology Innovation Institute (TII) and trained on Amazon SageMaker is available for customers through Amazon SageMaker JumpStart to deploy with one click for running inference. With a 180-billion-parameter size and trained on a massive 3.5-trillion-token dataset, Falcon 180B is the largest and one of the most performant models with openly accessible weights. You can try out this model with SageMaker JumpStart, a machine learning (ML) hub that provides access to algorithms, models, and ML solutions so you can quickly get started with ML. In this post, we walk through how to discover and deploy the Falcon 180B model via SageMaker JumpStart.
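Deployment through the SageMaker Python SDK takes only a few lines, as in the sketch below; the model ID follows JumpStart’s naming convention but should be confirmed against the JumpStart catalog:

```python
from sagemaker.jumpstart.model import JumpStartModel

# Confirm the exact model ID in the SageMaker JumpStart catalog
model = JumpStartModel(model_id="huggingface-llm-falcon-180b-bf16")
predictor = model.deploy()

response = predictor.predict({"inputs": "What is the capital of France?"})
print(response)
```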
What is Falcon 180B
Falcon 180B is a model released by TII that follows previous releases in the Falcon family. It’s a scaled-up version of Falcon 40B, and it uses multi-query attention for better scalability. It’s
Amazon SageMaker Domain in VPC only mode to support SageMaker Studio with auto shutdown Lifecycle Configuration and SageMaker Canvas with Terraform
Amazon SageMaker Domain supports SageMaker machine learning (ML) environments, including SageMaker Studio and SageMaker Canvas. SageMaker Studio is a fully integrated development environment (IDE) that provides a single web-based visual interface where you can access purpose-built tools to perform all ML development steps, from preparing data to building, training, and deploying your ML models, improving data science team productivity by up to 10x. SageMaker Canvas expands access to machine learning by providing business analysts with a visual interface that allows them to generate accurate ML predictions on their own—without requiring any ML experience or having to write a single line of code.
HashiCorp Terraform is an infrastructure as code (IaC) tool that lets you organize your infrastructure in reusable code modules. AWS customers rely on IaC to design, develop, and manage their cloud infrastructure, such as SageMaker Domains. IaC ensures that customer infrastructure and services are consistent, scalable, and reproducible
Implement smart document search index with Amazon Textract and Amazon OpenSearch
For modern companies that deal with enormous volumes of documents such as contracts, invoices, resumes, and reports, efficiently processing and retrieving pertinent data is critical to maintaining a competitive edge. However, traditional methods of storing and searching for documents can be time-consuming and often result in a large effort to find a specific document, especially when they include handwriting. What if there were a way to process documents intelligently and make them searchable with high accuracy?
This is made possible with Amazon Textract, AWS’s Intelligent Document Processing service, coupled with the fast search capabilities of OpenSearch. In this post, we’ll take you on a journey to rapidly build and deploy a document search indexing solution that helps your organization to better harness and extract insights from documents.
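As a condensed sketch of the core flow (the bucket, document key, index name, and OpenSearch host are placeholders, and authentication setup is omitted), Amazon Textract extracts the text and the result is indexed for full-text search:

```python
import boto3
from opensearchpy import OpenSearch

textract = boto3.client("textract")

# Extract raw text from a document image stored in S3
result = textract.detect_document_text(
    Document={"S3Object": {"Bucket": "my-docs-bucket", "Name": "contracts/offer.png"}}
)
text = " ".join(
    block["Text"] for block in result["Blocks"] if block["BlockType"] == "LINE"
)

# Index the extracted text so it becomes searchable
client = OpenSearch(
    hosts=[{"host": "search-demo.us-east-1.es.amazonaws.com", "port": 443}],
    use_ssl=True,
)
client.index(index="documents", body={"name": "contracts/offer.png", "content": text})
```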
Whether you’re in Human Resources looking for specific clauses in employee contracts, or a financial analyst sifting through a mountain of invoices