For some time we at Megaputer have been developing an exciting application of both natural language processing and machine learning. But before we discuss how a solution to the task of identifying subrogation opportunities works, let’s first define the problem in its domain.
Insurers initially fulfill claims regardless of fault
Consider an insurance company. Our solution deals with auto insurance, so it is helpful to use auto insurance to learn the basic concepts of the problem. Note, however, that the approach can be generalized to other types of insurance.
With auto insurance, it is a fact of life that sooner or later an accident will occur and the insurer will need to make coverage payments on behalf of the insured. It is also a fact that the insured party is frequently not liable for the damages incurred. For example, the other party involved in the accident may have run a red light or been cited for a DUI. In these situations, the insured may choose to pursue legal action against the liable party, but this rarely happens: the insured has already been covered by their insurance, a court battle would be costly, and they might not even succeed in the end. This leaves insurance companies rather displeased. Another party caused the accident and may be legally liable for the damages, yet the insurer ended up footing the bill!
The basic subrogation process
Fortunately, a legal process called subrogation allows an insurer to pursue damages against the liable party in place of the insured party. Thus, a sequence of events can now occur:
- An accident happens,
- the insurance company pays for damages,
- the insurance company reviews evidence and realizes that the other party was responsible,
- then the insurance company decides to act on this through subrogation in order to recover its loss from the other party (or, rather, the other party’s insurer).
The difficulty of subrogation
This is essentially the subrogation process. On a macro level, it is simple. Complications arise when we zoom in to see how it works in practice.
For instance, how does an insurance company assess that the other party was liable? It may have access to police reports and witness statements, but these need to be read by someone who can take all of the facts into account and decide not only how liable each party is under the law, but also how likely the insurer would be to succeed if it acted on the claim and tried to recover the payments. This can be a time-consuming process.
On top of that, the information does not all become available at the same time. It may be a while before certain records can be accessed, so each time new data arrives the analysis has to be redone with the new information taken into account. This may happen several times as batches come in.
Finally, this repetitive and time-consuming task needs to be performed on an overwhelming number of claims, which can drown an insurance company if it is not diligent in keeping ahead of the analysis. And so this is the heart of the problem behind subrogation opportunity identification as it is commonly done:
It is a manual, time-consuming task that is subject to the biases of the humans performing it, leading to inconsistency or simply missed opportunities.
Analyzing claims data to identify subrogation opportunities
Now that we understand the problem, let’s discuss a solution!
We have developed a system, PolyAnalyst, that ingests unstructured text records from insurance companies. These records compile the available information about a car accident, such as police reports and witness statements.
Next, these records are cleansed. The later steps depend on clean data, but real-world data is often full of data entry errors, acronyms, and other complexities of natural language. For example, we can automatically correct common spelling errors found in the claim text.
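To give a feel for what cleansing can involve, here is a minimal Python sketch that expands abbreviations and corrects misspellings by fuzzy matching against a known vocabulary. It is not how PolyAnalyst does it: the vocabulary, the abbreviation table, and the `cleanse` function are all assumptions made up for this illustration, and the fuzzy matching comes from Python's standard `difflib` module.

```python
import difflib
import re

# Hypothetical domain vocabulary and abbreviation table (assumptions for
# illustration only; a real system would use much larger resources).
VOCABULARY = ["intersection", "collision", "vehicle", "insured",
              "claimant", "stopped", "signal"]
ABBREVIATIONS = {"iv": "insured vehicle", "cv": "claimant vehicle"}


def cleanse(text):
    """Lowercase the text, expand abbreviations, and fix near-miss spellings."""
    cleaned = []
    for token in re.findall(r"[a-z]+", text.lower()):
        if token in ABBREVIATIONS:
            cleaned.append(ABBREVIATIONS[token])
            continue
        # Replace the token only if a vocabulary word is very similar to it;
        # otherwise keep the token unchanged.
        match = difflib.get_close_matches(token, VOCABULARY, n=1, cutoff=0.8)
        cleaned.append(match[0] if match else token)
    return " ".join(cleaned)


print(cleanse("CV hit IV at the intersecton; rear-end colision."))
```

The high similarity cutoff is a deliberate choice in this sketch: it is safer to leave an unknown word alone than to "correct" it into the wrong term.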
Then, with the power of PolyAnalyst’s natural language processing tools, we automate the extraction of the pieces of information that matter for making a subrogation decision on the claim. What traffic lights or signs did each party face? Was someone cited by the police, and if so, what for? How did the collision occur? These and many other questions are asked, and our automated information extraction system reads the text to find the answers.
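As a rough illustration of the idea of turning free text into structured facts, the following Python sketch applies a few hand-written patterns to a claim note. The patterns, the fact names, and the `extract_facts` function are hypothetical and far simpler than a real information extraction system; in particular, this sketch does not work out which party a fact applies to.

```python
import re

# Hypothetical extraction patterns (assumptions for illustration only).
RED_LIGHT = re.compile(r"\b(?:ran|run|running)\s+(?:a|the)\s+red\s+light\b")
REAR_END = re.compile(r"\brear[- ]?end(?:ed)?\b")
CITATION = re.compile(r"\bcited\s+for\s+(?P<offense>[a-z ]+?)(?:[.,;]|$)")


def extract_facts(text):
    """Return a dictionary of simple facts found in a claim note."""
    text = text.lower()
    citation = CITATION.search(text)
    return {
        "ran_red_light": bool(RED_LIGHT.search(text)),
        "rear_end": bool(REAR_END.search(text)),
        # The offense named after "cited for", or None if nobody was cited.
        "citation": citation.group("offense").strip() if citation else None,
    }


note = ("Claimant ran the red light and struck the insured vehicle. "
        "Claimant was cited for failure to yield.")
print(extract_facts(note))
```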
Once key facts are mined from the text, we use machine learning to create models for assessing whether a given claim is a good subrogation opportunity or not. Using thousands of historical claims whose subrogation status has been decided by humans, we have trained multiple models using advanced machine learning techniques such as decision trees, support vector machines, and neural networks. The models trained on this historical data can then be used to assess new claims.
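To make the modeling step concrete, here is a minimal sketch of the simplest possible decision tree, a single split (often called a decision stump), written in plain Python. The feature names and the six training rows are invented for illustration and are not real claims data; this is not the model we trained, and a production model would be trained on thousands of labeled claims and would normally use a library implementation.

```python
def gini(labels):
    """Gini impurity of a list of boolean labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def majority(labels):
    """Most common label; ties and empty lists resolve as described in code."""
    return sum(labels) * 2 >= len(labels) if labels else False


def train_stump(rows, labels):
    """Pick the boolean feature whose split gives the lowest weighted impurity."""
    best = None
    for feature in rows[0]:
        yes = [y for r, y in zip(rows, labels) if r[feature]]
        no = [y for r, y in zip(rows, labels) if not r[feature]]
        score = (len(yes) * gini(yes) + len(no) * gini(no)) / len(labels)
        if best is None or score < best[0]:
            best = (score, feature, majority(yes), majority(no))
    _, feature, if_true, if_false = best
    return {"feature": feature, "if_true": if_true, "if_false": if_false}


def predict(stump, row):
    """Predict whether a claim is a subrogation opportunity."""
    return stump["if_true"] if row[stump["feature"]] else stump["if_false"]


# Toy, made-up training data: extracted facts plus a human decision.
rows = [
    {"other_party_cited": True, "insured_cited": False, "single_vehicle": False},
    {"other_party_cited": True, "insured_cited": False, "single_vehicle": False},
    {"other_party_cited": False, "insured_cited": True, "single_vehicle": False},
    {"other_party_cited": False, "insured_cited": False, "single_vehicle": True},
    {"other_party_cited": True, "insured_cited": True, "single_vehicle": False},
    {"other_party_cited": False, "insured_cited": False, "single_vehicle": False},
]
labels = [True, True, False, False, True, False]

stump = train_stump(rows, labels)
new_claim = {"other_party_cited": True, "insured_cited": False, "single_vehicle": False}
print(stump["feature"], predict(stump, new_claim))
```

A full decision tree repeats this same split selection recursively on each side of the split until the groups are pure enough or too small to divide further.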
The benefits of computer-assisted subrogation
With a fully automated and central system, processing claims for subrogation opportunities can be done in a consistent manner in almost no time. The sooner an opportunity is identified, the sooner the company can recover the funds it is entitled to.
This solution can be utilized in a number of ways beyond simply replacing humans in this task. It can be used as a first-pass system that quickly sorts claims based on how likely they are to be opportunities. For instance, a claim about a rock hitting the windshield would never be a subrogation opportunity, so the machine can quickly read and sort out such claims, and a human does not waste time on them and can instead spend their expertise on examining complicated claims. Or the system could be used as a quick check on the work of the humans. Claims that are rejected by human analysts can subsequently be fed to the system and rapidly and inexpensively checked once more to make sure no claims with good subrogation potential are missed. If the system thinks it has found an opportunity, it can alert a human, helping the insurance company pursue recovery on all claims that lend themselves to subrogation.
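Both usage modes come down to routing claims by a model score. The Python sketch below assumes each claim already carries a score between 0 and 1 from some trained model; the thresholds, the queue names, and the `triage` and `second_check` functions are hypothetical choices for illustration, not settings from our system.

```python
def triage(claims, low=0.1, high=0.7):
    """First pass: sort claims by score and route each one to a queue."""
    queues = {"pursue": [], "review": [], "dismiss": []}
    for claim in sorted(claims, key=lambda c: c["score"], reverse=True):
        if claim["score"] >= high:
            queues["pursue"].append(claim["id"])   # likely opportunity
        elif claim["score"] < low:
            queues["dismiss"].append(claim["id"])  # e.g. rock hits windshield
        else:
            queues["review"].append(claim["id"])   # needs human expertise
    return queues


def second_check(rejected_claims, high=0.7):
    """Quality check: flag human-rejected claims that the model scores highly."""
    return [c["id"] for c in rejected_claims if c["score"] >= high]


# Made-up claims with made-up scores.
claims = [
    {"id": "A", "score": 0.92},
    {"id": "B", "score": 0.03},
    {"id": "C", "score": 0.45},
    {"id": "D", "score": 0.75},
]
print(triage(claims))
print(second_check([{"id": "E", "score": 0.81}, {"id": "F", "score": 0.20}]))
```

Where the two thresholds are set is a business decision: a lower `high` threshold sends more claims to recovery at the cost of more false alarms, while a higher `low` threshold saves analyst time but risks dismissing a real opportunity.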
Ultimately, this system highlights the power of jointly applying advanced text analysis and predictive modeling. This particular application was built with subrogation in mind, but the process of reading text, extracting information, and then using machine learning to make decisions from that information is a general approach that can be applied in many other domains.
We hope that this short synopsis of our solution has been interesting or has sparked ideas on how text analysis and machine learning could be useful in your own work.