Monday, July 10, 2017
How PointPredictive Is Using Machine Learning To Uncover Fraud, With Tim Grace
There's been a revolution in the use of artificial intelligence and machine learning in the last few years, by both startups and large companies, to help in a large number of areas. Tim Grace, who co-founded San Diego-based PointPredictive with Frank McKenna and Joe Jackson, is a startup veteran who sold BasePoint Analytics to CoreLogic in 2009 and is a former executive at HNC Software. He sat down with us to tell us how PointPredictive is harnessing artificial intelligence and machine learning and applying them to uncovering fraud in auto loan applications--and how that can save millions of dollars for lenders. It's not the first time Grace and his co-founders have applied this technique to an area of interest to the finance industry: they did it before for credit cards when they were executives at HNC Software, and for mortgages at their last startup, BasePoint Analytics. We talked with Tim about how they're now applying this to a new market.
What is PointPredictive?
Tim Grace: PointPredictive is an artificial intelligence, machine learning, and pattern recognition provider. If you are familiar with your FICO credit score, we do the same kind of scoring for fraud. With a FICO score, you are looking at a consumer's likelihood of paying back a loan. What we provide is a measure of whether an application's information is true and accurate. The way we think about it is, if you are going to pull a credit score, it makes sense to also pull a fraud score. If the information you are relying upon for a credit score is false, then that credit score is almost unusable. You want to make sure that the income they are providing, the occupation, and the years they've been in that occupation are all correct. With each company that we've started, we've concentrated on a particular market. PointPredictive is focused on the automotive lending market. What we found when we studied and surveyed all of our lenders is that right now that market is a $4 to $6 billion industry, with lots of fraud from misrepresentation. What we typically see when we implement or test our scores against applications they have previously funded and taken a loss on is that we could have reduced net chargeoffs by 40 to 60 percent. For larger lenders, that might represent $40M to $50M in carveout of losses that they can reduce.
You've done this before in other industries--what led to the new company?
Tim Grace: We all got our start at HNC Software in San Diego. They were the first company to bring artificial intelligence and machine learning scoring to the commercial market. Before that, it had just been used in the defense industry. That particular company had been applying this kind of software to credit card fraud. I spent a good portion of my career there. We went public and got to 85 percent market share with our software, and from there, Frank and I applied that same technology to the mortgage space when we started another company, BasePoint Analytics. We had that company for about 5 years, and got great penetration into mortgage fraud applications. We sold that to CoreLogic in 2009, and that company now has roughly a 60 percent market share of mortgage applications. We were looking for another fraud problem to solve, and we'd already checked the box for credit cards, checked the box for mortgages, and saw that automotive lending had no comprehensive fraud score for applications. When we looked at their loss numbers, we knew the type of statistical analysis we were doing could solve that problem, too, which is why we started the company.
AI and machine learning have been around for a long time, but only now does it seem like a lot of companies are taking advantage of them in the market. Why is that?
Tim Grace: I think there are a couple of reasons. One is that, in order to make AI and machine learning effective, you have to have models which are very targeted, with high detection rates but low false positives. To do that, you need quite a bit of data. What's happened over the last fifteen years is that our ability to capture, store, and utilize data has changed dramatically. When I was at HNC Software, the model was, we need a bigger hard drive to store data. We'd go out and ask for a terabyte of storage, and it would cost us hundreds of thousands of dollars. Nowadays, you can easily get a terabyte on your laptop. I think that ability to capture and store that kind of data, and make it available to data scientists, has really changed things dramatically. That's one of the bigger reasons. I also think at the education level, people have seen that machine learning and AI have a much better accuracy rate than some of the other, rules-based systems of the past. We've seen that in credit cards, in telecom, and thanks to BasePoint, mortgages. Now we're doing that for automotive. I think as we continue to demonstrate the accuracy of those techniques, and the savings from AI and machine learning, people really want to take advantage of that technology.
Explain a little why machine learning is so much better than the rules-based systems used before?
Tim Grace: What we've seen with rules-based systems is that you get a very high false positive rate. You get alerted to things, so you need to review things at a very high rate. The share of those that actually turn out to be misrepresentation is very low. When you look at the metrics, in some cases, the rules-based systems will have 200 to 300 to 1 false positive rates. That means you have to look at 300 apps to find one which actually was misrepresentation. The reason for this is that there are lots of reasons something might be flagged as a misrepresentation. For example, it could be a simple typo, or it could be that I just didn't know that they needed my gross income, and actually put down my net income instead. So, when you look at rules that only consider one field or character, they give you lots of errors you have to clear. With machine learning, we take all of these things into consideration together before generating a score. It will look at what the income norms are for a particular zip code, and how far outside of the norm that income is. If it's really outside the norm, it may increase the score a bit. Only, that isn't going to be enough. There have to be multiple instances where things don't look normal for a score to increase. By looking at a holistic view of hundreds of features and comparing that to the norm, it has a much higher accuracy than using just that one field of information in a solution.
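To make the contrast concrete, here is a minimal Python sketch of the idea described above. It is an illustration only, not PointPredictive's model: the field names, the norms, the cutoff of two standard deviations, and the "at least two unusual fields" requirement are all hypothetical, and a hand-written count stands in for what a trained model would learn from data across hundreds of features.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Norm:
    """Hypothetical norm for one application field (e.g. within a zip code)."""
    mean: float
    std: float


def deviation(value: float, norm: Norm) -> float:
    """How many standard deviations a value sits from its norm."""
    return abs(value - norm.mean) / norm.std


def single_field_rule(app: dict, norms: dict, field: str = "income",
                      cutoff: float = 2.0) -> bool:
    """Rules-based style: flag as soon as one field looks unusual."""
    return deviation(app[field], norms[field]) > cutoff


def combined_score(app: dict, norms: dict, cutoff: float = 2.0) -> int:
    """Count how many fields look unusual when considered together."""
    return sum(1 for name, norm in norms.items()
               if deviation(app[name], norm) > cutoff)


def combined_flag(app: dict, norms: dict, min_unusual: int = 2) -> bool:
    """Flag only when several fields are unusual, not just one."""
    return combined_score(app, norms) >= min_unusual


# Assumed norms, made up for this example.
NORMS = {
    "income": Norm(mean=60_000, std=15_000),
    "years_in_occupation": Norm(mean=6, std=3),
    "loan_amount": Norm(mean=25_000, std=8_000),
}

# Only the income is unusual (could be a typo or an honest mistake).
one_odd_field = {"income": 95_000, "years_in_occupation": 5,
                 "loan_amount": 24_000}

# Income, tenure, and loan amount are all far from the norm.
several_odd_fields = {"income": 120_000, "years_in_occupation": 15,
                      "loan_amount": 60_000}

for label, app in [("one odd field", one_odd_field),
                   ("several odd fields", several_odd_fields)]:
    print(label,
          "| rule flags:", single_field_rule(app, NORMS),
          "| combined flags:", combined_flag(app, NORMS))
```

In this sketch the single-field rule sends both applications to review, while the combined check flags only the one where several fields look abnormal at once, which is the mechanism Grace credits for the lower false positive rate.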
Finally, what's next for you?
Tim Grace: We were quite busy even before the funding. We have 46 lenders that we are in conversation with, and we're very active in our sales process, with twelve live pilots scheduled for the next sixty days. We are very active in 16 of the top 25 lenders in the United States, and what we are doing now is adding people to support the volume of customers we have.
Thanks, and good luck!