By Sara Bondell - June 22, 2020
Prior to the 1960s, all medical records were handwritten on paper and manually filed on shelves. The creation of the electronic medical record (EMR) drastically changed how patient information is recorded and stored.
The information in EMRs is critical for cancer care, but there is no easy way to extract it. Earlier use of natural language processing (NLP), a type of artificial intelligence, helped, but cancer care teams were still often forced to comb through records by hand to make treatment decisions.
In late 2018, a powerful new deep-learning NLP algorithm was published: Bidirectional Encoder Representations from Transformers, or BERT. For the first time, NLP achieved human-level performance across multiple benchmark tasks. Researchers at Moffitt Cancer Center wondered whether BERT could be trained to extract clinically relevant data from pathology reports.
Moffitt’s Certified Tumor Registrars (CTRs) collect and report data about cancer patients to state and federal agencies. While doing this, they label important information in each pathology report. Researchers used this data to train BERT to answer two predetermined questions from a pathology report: “What organ contains the tumor?” and “What is the kind of tumor or carcinoma?”
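To make the setup concrete, here is a minimal sketch of how registrar labels could be turned into question-and-answer training examples. It is an illustration only, not the Moffitt team's pipeline: the `QAExample` structure, the helper functions, and the sample report text are all hypothetical, and the sketch assumes each label appears verbatim in the report so that its position can be located.

```python
from dataclasses import dataclass
from typing import List, Optional

# The two predetermined questions quoted in the article.
SITE_QUESTION = "What organ contains the tumor?"
HISTOLOGY_QUESTION = "What is the kind of tumor or carcinoma?"


@dataclass
class QAExample:
    """One extractive question-answering example (hypothetical structure)."""
    question: str
    context: str        # full text of the pathology report
    answer_text: str    # span labeled by the registrar
    answer_start: int   # character offset of the span within the context


def build_example(question: str, report_text: str, labeled_span: str) -> Optional[QAExample]:
    """Locate the labeled span in the report; skip it if it is not found verbatim."""
    start = report_text.find(labeled_span)
    if start == -1:
        return None
    return QAExample(question, report_text, labeled_span, start)


def examples_from_report(report_text: str, site_label: str, histology_label: str) -> List[QAExample]:
    """Build up to two examples (tumor site, histology) from one labeled report."""
    candidates = [
        build_example(SITE_QUESTION, report_text, site_label),
        build_example(HISTOLOGY_QUESTION, report_text, histology_label),
    ]
    return [example for example in candidates if example is not None]


# Invented report text, for illustration only.
report = "Left breast, core biopsy: invasive ductal carcinoma, grade 2."
examples = examples_from_report(report, "breast", "invasive ductal carcinoma")
for example in examples:
    print(example.question, "->", example.answer_text, "@", example.answer_start)
```

Each report yields one example per question, so a single labeled report can teach the model both where the tumor is and what kind it is.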
Training BERT required close collaboration within a diverse, interdisciplinary team that included researchers from Moffitt’s Cancer Registry, Data Quality, Health Informatics, Biostatistics & Bioinformatics, and AI teams.
After training BERT on 8,200 pathology reports, the researchers assessed the accuracy of the question-and-answer program on 2,050 new reports. The results, presented at the American Association for Cancer Research Annual Meeting, show BERT could determine the histology of a tumor with 96.7% accuracy and the tumor site with 92.9% accuracy.
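The article does not say how accuracy was scored, so the following is only a sketch under one plausible assumption: a prediction counts as correct when it matches the registrar's label after case and whitespace are normalized. The function names and the four sample answers are hypothetical.

```python
from typing import Sequence


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not count as errors."""
    return " ".join(answer.lower().split())


def exact_match_accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of predictions that match their reference label after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty evaluation set")
    correct = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return correct / len(references)


# Invented answers for four held-out reports, for illustration only.
predicted_sites = ["Breast", "lung", "colon", "prostate"]
labeled_sites = ["breast", "lung", "rectum", "prostate"]

site_accuracy = exact_match_accuracy(predicted_sites, labeled_sites)
print(f"Tumor site accuracy: {site_accuracy:.1%}")
```

Scoring on reports the model never saw during training, as the researchers did with their 2,050 held-out reports, is what makes the reported figures a fair estimate of performance on future patients.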
The team is now working to improve the performance and training of BERT using nearly 500,000 pathology reports. The goal is to use the deep-learning program to extract even more information from health records to better facilitate personalized medicine.
“A patient can come in and literally within minutes we could search their pathology reports and then we can start suggesting clinical trials they may be suitable for rather than relying on someone to go and look,” said Dr. Ross Mitchell, Moffitt’s Artificial Intelligence Officer. “If we get the criteria for the clinical trial and we can extract this, then we are a huge step towards automating finding patients their clinical trials.”