Pilot study on the use of IBM Watson Content Analytics with free-text clinical reports

Project description

A significant proportion of clinical data is stored as unstructured free-text reports, such as discharge summaries or radiology reports, which makes it difficult to process and analyse at scale. Text analytics methods such as document retrieval and information extraction can address this challenge. I conducted a three-month pilot study on using IBM Watson Content Analytics to identify relevant documents in large collections of clinical reports (~6.5 million documents in total). My task was to retrieve documents that contain positive instances of certain conditions (e.g. “mild hydronephrosis is noted” counts as a positive instance, whereas “no evidence of hydronephrosis” counts as a negative instance). The custom rule-based models built with IBM Watson Content Analytics achieved very good results on this task.
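To illustrate the distinction between positive and negative instances, here is a minimal Python sketch of a negation-aware rule. It is not the IBM Watson Content Analytics model used in the study, and it does not use any Watson API. The cue list `NEGATION_CUES` and the functions `is_positive_mention` and `retrieve_positive` are hypothetical names introduced for this example, and the rule (treat a mention as negative if a negation cue appears earlier in the same sentence) is a deliberate simplification.

```python
import re

# Hypothetical, deliberately tiny list of negation cues (an assumption for
# illustration; a real rule set would be much larger and more nuanced).
NEGATION_CUES = ("no evidence of", "no", "without", "negative for")


def is_positive_mention(sentence: str, condition: str) -> bool:
    """Return True if `condition` occurs in `sentence` with no negation cue before it."""
    text = sentence.lower()
    pattern = r"\b" + re.escape(condition.lower()) + r"\b"
    for match in re.finditer(pattern, text):
        preceding = text[:match.start()]
        negated = any(
            re.search(r"\b" + re.escape(cue) + r"\b", preceding)
            for cue in NEGATION_CUES
        )
        if not negated:
            return True
    return False


def retrieve_positive(reports: dict, condition: str) -> list:
    """Return the ids of reports with at least one positive mention of `condition`."""
    hits = []
    for doc_id, text in reports.items():
        # Naive sentence split on terminal punctuation followed by whitespace.
        sentences = re.split(r"(?<=[.!?])\s+", text)
        if any(is_positive_mention(s, condition) for s in sentences):
            hits.append(doc_id)
    return hits


reports = {
    "r1": "Mild hydronephrosis is noted.",
    "r2": "No evidence of hydronephrosis.",
    "r3": "Normal kidneys.",
}
print(retrieve_positive(reports, "hydronephrosis"))
```

With the three toy reports above, only `r1` is retrieved. A sketch like this would miss many real-world cases (negation after the mention, hedged or historical findings, spelling variants), which is where a richer rule-based model becomes useful.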

