Highest Rated Comments
nlpster 3 karma
There has been a lot of interest in work on structured data (e.g. lab tests, clinical codes, patient metadata) and lots of progress on imaging data, but comparatively little work and progress on medical free text. Given that so much information is stored in free text, why do you think this is, and is it a problem for advancing clinical ML?
nlpster 2 karma
Thanks! I will have a listen to that, it looks really interesting. I think the correct link is here
Totally agree about healthcare dynamics. Deciding what the ground truth is going to be for a project is critical, and it's not something that you can decide before you explore the data. A lot of people are put off because they don't get a clean dataset on which to run some stats and test a hypothesis, but I find handling the noise is where the fun is!
nlpster 3 karma
I hear you on the awful data! In addition to your points, it turns out that quite a few clinicians can't type or spell either (under time pressure).
My concern about automated free-text analysis lagging behind comes from an epidemiological perspective. There is a growing body of evidence that retrospective clinical studies are going to be a bit 'iffy' if they don't use the clinical notes ([1], inter alia).
Some highlights taken from [1] :
The problem is that manually reading these notes to extract the information is very time-consuming for researchers. If this task could be automated, it would be a real win for epidemiology. And yes, it's extremely hard...
[1] Price, Sarah Jane (2016). What Are We Missing by Ignoring Text Records in the Clinical Practice Research Datalink? Using Three Symptoms of Cancer as Examples to Estimate the Extent of Data in Text Format That Is Hidden to Research. https://ore.exeter.ac.uk/repository/handle/10871/21692
edit : list formatting