As more businesses increase their online presence to better serve their customers, new fraud patterns are continually emerging. In today’s ever-evolving digital landscape, where fraudsters are increasingly sophisticated in their tactics, detecting and preventing such fraudulent activities has become paramount for businesses and financial institutions.
Traditional rule-based fraud detection systems are limited in their ability to iterate quickly because they rely on predefined rules and thresholds to flag potentially fraudulent activity. These systems can generate a large number of false positives, significantly increasing the volume of manual investigations performed by the fraud team. Humans are also error-prone and have a limited capacity to process large amounts of data, which makes manual fraud detection time-consuming and can result in missed fraudulent transactions, increased losses, and reputational damage.
Machine learning (ML) plays a crucial role in fraud detection because it can quickly and accurately analyze large volumes of data to identify anomalous patterns and potential fraud trends. The performance of an ML fraud model depends heavily on the quality of the data it is trained on, and for supervised models in particular, accurately labeled data is crucial. In ML, the lack of meaningful historical data to train a model is called the cold start problem.
In the world of fraud detection, the following are some typical cold start scenarios:
- Building an accurate fraud model without transaction history or fraud cases
- Accurately distinguishing legitimate activity from fraud for new customers and accounts
- Risk-decisioning payments to an address or payee never before seen by the fraud system
There are several ways to address these scenarios. For example, you can use generic models, known as one-size-fits-all models, which are typically trained on top of fraud data sharing platforms such as fraud consortia. The challenge with this approach is that no two businesses are alike and fraud attack vectors change constantly.
Another option is to use an unsupervised anomaly detection model to monitor and detect unusual behavior among customer events. The challenge with this approach is that not all fraud events are anomalies, and not all anomalies are actually fraud, so you can expect higher false positive rates.
In this post, we show how you can quickly bootstrap a real-time fraud prevention ML model with as few as 100 events using Amazon Fraud Detector’s new Cold Start feature, dramatically lowering the barrier to entry to custom ML models for the many organizations that simply do not have the time or ability to collect and accurately label large datasets. In addition, we discuss how, by using Amazon Fraud Detector stored events, you can review the results and correctly label events to retrain your models, thereby improving the effectiveness of your fraud prevention measures over time.
Solution overview
Amazon Fraud Detector is a fully managed fraud detection service that automates the detection of potentially fraudulent online activity. You can use Amazon Fraud Detector to build custom fraud detection models with your own historical data set, add decision logic using the built-in rules engine, and orchestrate risk decision workflows with the click of a button.
Previously, you had to provide more than 10,000 labeled events with at least 400 examples of fraud to train a model. With the launch of the Cold Start feature, you can quickly train a model with a minimum of 100 events, at least 50 of which are classified as fraud. Compared to the initial data requirements, this represents a 99% reduction in historical data and an 87% reduction in label requirements.
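The arithmetic behind those reductions, and a simple pre-flight check against the new minimums, can be sketched in a few lines of Python. This is only an illustration: the function names and the `fraud_label` value are hypothetical and are not part of Amazon Fraud Detector. Note that the label reduction works out to 87.5%, which the text states as 87%.

```python
from typing import Sequence

# Data requirements quoted in the text above.
PREVIOUS_MIN_EVENTS = 10_000
PREVIOUS_MIN_FRAUD = 400
COLD_START_MIN_EVENTS = 100
COLD_START_MIN_FRAUD = 50


def percent_reduction(before: int, after: int) -> float:
    """Return the relative reduction from `before` to `after`, in percent."""
    return (before - after) * 100 / before


def meets_cold_start_minimums(labels: Sequence[str], fraud_label: str = "fraud") -> bool:
    """Check a sequence of event labels against the Cold Start minimums.

    `fraud_label` is a hypothetical label value; use whatever your data uses.
    """
    fraud_count = sum(1 for label in labels if label == fraud_label)
    return len(labels) >= COLD_START_MIN_EVENTS and fraud_count >= COLD_START_MIN_FRAUD


print(percent_reduction(PREVIOUS_MIN_EVENTS, COLD_START_MIN_EVENTS))  # historical data
print(percent_reduction(PREVIOUS_MIN_FRAUD, COLD_START_MIN_FRAUD))    # fraud labels
```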
The new Cold Start feature provides intelligent methods for enriching, scaling, and risk modeling small datasets. In addition, Amazon Fraud Detector performs label assignment and sampling for unlabeled events.
Experiments with public datasets show that by lowering the limits to 50 frauds and only 100 events, you can build fraud ML models that consistently outperform unsupervised and semi-supervised models.
Cold Start model performance
The ability of an ML model to generalize and make accurate predictions on unseen data is affected by the quality and diversity of the training dataset. For Cold Start models, this is no different. You should have processes in place as more data is collected to correctly label these events and retrain the models, ultimately leading to optimal model performance.
With a lower data requirement, the reported performance becomes less stable because of increased model variance and the limited size of the test data. To help you set the right expectations for model performance, Amazon Fraud Detector reports uncertainty range metrics in addition to the model AUC. The following table defines these metrics.
| AUC uncertainty range | AUC < 0.6 | AUC 0.6 – 0.8 | AUC >= 0.8 |
| --- | --- | --- | --- |
| > 0.3 | Model performance is very low and can vary widely. Expect poor fraud detection performance. | Model performance is low and can vary widely. Expect limited fraud detection performance. | Model performance may vary widely. |
| 0.1 – 0.3 | Model performance is very low and can vary significantly. Expect poor fraud detection performance. | Model performance is low and can vary significantly. Expect limited fraud detection performance. | Model performance may vary significantly. |
| < 0.1 | Model performance is very low. Expect poor fraud detection performance. | Model performance is low. Expect limited fraud detection performance. | No warning. |
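To make the table easier to apply, the lookup can be sketched as a small Python function. This is an illustration only, not something Amazon Fraud Detector exposes. The function names are hypothetical, the handling of values that fall exactly on a boundary (0.1, 0.3, 0.6) is an assumption, and the bottom-right cell is assumed to mean that no warning is shown.

```python
from typing import Optional


def auc_band(auc: float) -> str:
    """Map an AUC value to the column of the table it falls in."""
    if auc < 0.6:
        return "very low"
    if auc < 0.8:
        return "low"
    return "ok"


def uncertainty_band(uncertainty_range: float) -> Optional[str]:
    """Map an AUC uncertainty range to the row of the table it falls in."""
    if uncertainty_range > 0.3:
        return "widely"
    if uncertainty_range >= 0.1:
        return "significantly"
    return None  # narrow range: no variability warning


def performance_warning(auc: float, uncertainty_range: float) -> Optional[str]:
    """Return the table's message, or None when no warning applies."""
    level = auc_band(auc)
    spread = uncertainty_band(uncertainty_range)
    if level == "ok":
        return f"Model performance may vary {spread}." if spread else None
    expectation = "poor" if level == "very low" else "limited"
    variability = f" and can vary {spread}" if spread else ""
    return (
        f"Model performance is {level}{variability}. "
        f"Expect {expectation} fraud detection performance."
    )


print(performance_warning(0.72, 0.2))
```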
Train a Cold Start model
Training a Cold Start fraud model is identical to training any other Amazon Fraud Detector model; what differs is the size of the dataset. Example datasets for Cold Start training can be found in our GitHub repository. To train a custom Amazon Fraud Detector model, you can follow our hands-on tutorial. You can use the Amazon Fraud Detector console tutorial or the SDK tutorial to build, train, and deploy a fraud detection model.
After training your model, you can review the performance metrics and then deploy it by changing its status to Active. For more information about model scores and performance metrics, see Model Scores and Model Performance Metrics. At this point, you can add your model to your detector, add business rules to interpret the risk scores generated by the model, and make real-time predictions using the GetEventPrediction API.
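If you deploy from the SDK rather than the console, activation could look roughly like the sketch below. It assumes a boto3 Amazon Fraud Detector client (`boto3.client("frauddetector")`) and its `update_model_version_status` operation. The wrapper name and the sample IDs in the comments are hypothetical, so check the parameters against the SDK reference before relying on them.

```python
def activate_model_version(client, model_id, model_type, model_version_number):
    """Ask Amazon Fraud Detector to deploy a trained model version.

    `client` is expected to be a boto3 "frauddetector" client. All IDs are
    supplied by the caller; nothing here is specific to Cold Start models.
    """
    return client.update_model_version_status(
        modelId=model_id,
        modelType=model_type,
        modelVersionNumber=model_version_number,
        status="ACTIVE",
    )


# Example usage (requires AWS credentials; the names below are placeholders):
# import boto3
# client = boto3.client("frauddetector")
# activate_model_version(client, "sample_model", "ONLINE_FRAUD_INSIGHTS", "1.0")
```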
Continuous improvement of ML fraud model and feedback loop
With Amazon Fraud Detector’s cold start feature, you can quickly launch a fraud detector endpoint and start protecting your businesses right away. However, new patterns of fraud continually emerge, so it is critical to retrain Cold Start models with newer data to improve the accuracy and effectiveness of predictions over time.
To help you iterate on your models, Amazon Fraud Detector automatically stores all events sent to the inference service. You can change or validate that the event ingestion flag is enabled at the event type level, as shown in the screenshot below.
With the Stored Events feature, you can use the Amazon Fraud Detector SDK to programmatically access an event, review the event’s metadata and prediction explanation, and make an informed risk decision. Additionally, you can label the event for future model retraining and continuous model improvement. The diagram below shows an example of this workflow.
In the following code snippets, we demonstrate the process for labeling a stored event:
- To make a real-time fraud prediction on an event, call the GetEventPrediction API:
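The original snippet for this step is not available here, so the following is a minimal sketch rather than the authors' code. It assumes a boto3 `frauddetector` client and its `get_event_prediction` operation. The helper names are hypothetical, and any detector, event type, entity, and variable names you pass in are your own.

```python
from datetime import datetime, timezone


def build_prediction_request(detector_id, event_id, event_type,
                             entity_type, entity_id, variables, timestamp=None):
    """Assemble keyword arguments for get_event_prediction.

    `timestamp` should be a timezone-aware datetime; it defaults to now (UTC).
    """
    ts = (timestamp or datetime.now(timezone.utc)).astimezone(timezone.utc)
    return {
        "detectorId": detector_id,
        "eventId": event_id,
        "eventTypeName": event_type,
        "eventTimestamp": ts.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "entities": [{"entityType": entity_type, "entityId": entity_id}],
        # Event variable values are sent as strings.
        "eventVariables": {name: str(value) for name, value in variables.items()},
    }


def predict(client, **request_fields):
    """Call get_event_prediction and return (model scores, rule results)."""
    response = client.get_event_prediction(**build_prediction_request(**request_fields))
    return response["modelScores"], response["ruleResults"]


# Example usage (requires AWS credentials):
# import boto3
# scores, rules = predict(boto3.client("frauddetector"), detector_id=..., ...)
```

Separating request construction from the call keeps the payload easy to inspect and test without network access.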
As seen in the response, based on the matched decision engine rule, the event should be sent for manual review by the fraud team. By gathering the prediction explanation metadata, you can learn how each event variable affected the model’s fraud prediction score.
- To gather these insights, call the get_event_prediction_metadata API:
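As a rough sketch of this step (again, not the original snippet), the code below fetches the prediction metadata with a boto3 `frauddetector` client and ranks variables by their impact on the score. The helper names are hypothetical. The response fields it reads (`evaluatedModelVersions`, `evaluations`, `predictionExplanations`, `variableImpactExplanations`, `logOddsImpact`) reflect my understanding of the API and should be verified against the SDK reference.

```python
def fetch_prediction_metadata(client, event_id, event_type, detector_id,
                              detector_version_id, prediction_timestamp):
    """Retrieve stored details for one past prediction."""
    return client.get_event_prediction_metadata(
        eventId=event_id,
        eventTypeName=event_type,
        detectorId=detector_id,
        detectorVersionId=detector_version_id,
        predictionTimestamp=prediction_timestamp,
    )


def variable_impacts(metadata):
    """Return (variable name, log-odds impact) pairs, largest magnitude first."""
    impacts = []
    for model in metadata.get("evaluatedModelVersions", []):
        for evaluation in model.get("evaluations", []):
            explanations = evaluation.get("predictionExplanations", {})
            for item in explanations.get("variableImpactExplanations", []):
                impacts.append((item["eventVariableName"], item["logOddsImpact"]))
    return sorted(impacts, key=lambda pair: abs(pair[1]), reverse=True)
```

Sorting by absolute impact surfaces the variables that pushed the score hardest in either direction, which is usually what an analyst wants to see first.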
API response:
With this knowledge, the fraud analyst can make an informed risk decision about the event in question and update the event label.
- To update the event label, call the update_event_label API:
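A minimal sketch of the labeling call, assuming a boto3 `frauddetector` client and its `update_event_label` operation. The wrapper name is hypothetical, and the label you assign must already be defined for your event type.

```python
from datetime import datetime, timezone


def label_event(client, event_id, event_type, label, labeled_at=None):
    """Assign `label` to a stored event and return the timestamp that was sent.

    `labeled_at` should be a timezone-aware datetime; it defaults to now (UTC).
    """
    moment = (labeled_at or datetime.now(timezone.utc)).astimezone(timezone.utc)
    label_timestamp = moment.strftime("%Y-%m-%dT%H:%M:%SZ")
    client.update_event_label(
        eventId=event_id,
        eventTypeName=event_type,
        assignedLabel=label,
        labelTimestamp=label_timestamp,
    )
    return label_timestamp


# Example usage (requires AWS credentials; the values are placeholders):
# import boto3
# label_event(boto3.client("frauddetector"), "event-id", "event-type", "fraud")
```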
API response
As a final step, you can verify that the event label has been successfully updated.
- To verify the event label, call the get_event API:
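The verification step might be sketched as follows, assuming a boto3 `frauddetector` client whose `get_event` response carries the stored event under an `event` key with a `currentLabel` field. The helper names are hypothetical.

```python
def current_label(client, event_id, event_type):
    """Return the label currently stored for an event, or None if unlabeled."""
    response = client.get_event(eventId=event_id, eventTypeName=event_type)
    return response.get("event", {}).get("currentLabel")


def label_was_applied(client, event_id, event_type, expected_label):
    """Check that the stored event carries the label the analyst assigned."""
    return current_label(client, event_id, event_type) == expected_label


# Example usage (requires AWS credentials; the values are placeholders):
# import boto3
# label_was_applied(boto3.client("frauddetector"), "event-id", "event-type", "fraud")
```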
API response
Clean up
To avoid incurring future charges, please delete the resources created for the solution.
Conclusion
This post demonstrated how you can quickly start a real-time fraud prevention system with around 100 events using Amazon Fraud Detector’s new Cold Start feature. We discussed how you can use stored events to review results and correctly label events and retrain your models, improving the effectiveness of your fraud prevention measures over time.
Fully managed AWS services, such as Amazon Fraud Detector, help reduce the time companies spend analyzing user behavior to identify fraud on their platforms, so they can focus more on driving business value. To learn more about how Amazon Fraud Detector can help your business, visit Amazon Fraud Detector.
About the Authors
Marcel Pividal is a global AI services solutions architect in the Worldwide Specialist Organization. Marcel has over 20 years of experience solving business problems through technology for FinTech, payment providers, pharma, and government agencies. His current focus areas are risk management, fraud prevention, and identity verification.
Julia Xu is a Research Scientist with Amazon Fraud Detector. She is passionate about solving customer challenges using machine learning techniques. In her free time, she enjoys hiking, painting, and exploring new cafes.
Guilherme Ricci is a senior solutions architect at AWS, helping startups modernize and cost-optimize their applications. With more than 10 years of experience with companies in the financial sector, he currently works together with the team of AI/ML specialists.