Ian Mulvany

November 23, 2021

A new NLP Challenge for mining biomedical literature

#NLP #future-of-knowledge #STEM #publishing 

There are too many papers and too many researchers for any one person, group, or small network to keep abreast of, so the future must be one where machines do a significant amount of the reading for us.

That will require many things, among them the availability of high-quality machine learning tools trained on precisely the literature that we need our machines to read.

One way to drive that is to use the competition format to engage wider communities in the problem, and this is just what the NIH National Center for Advancing Translational Sciences has announced.

The LitCoin challenge posts $100K in prize money, and it says:

"Biomedical researchers need to be able to use open scientific data to create new research hypotheses and lead to more treatments for more people more quickly. Reading all of the literature that could be relevant to their research topic can be daunting or even impossible, and this can lead to gaps in knowledge and duplication of effort. Transforming knowledge from biomedical literature into knowledge graphs can improve researchers’ ability to connect disparate concepts and build new hypotheses, and can allow them to discover work done by others which may be difficult to surface otherwise."

The competition will be run in two phases:

"with a defined task for each phase. The first phase will focus on the annotation of biomedical concepts from free text, and the second phase will focus on creating knowledge assertions between annotated concepts."
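
To make those two phases a little more concrete, here is a minimal toy sketch in Python. It is my own illustration, not the LitCoin data format or scoring method: the lexicon, the label names, the `associated_with` relation, and the helper functions are all hypothetical, and a real entry would use trained models rather than string matching. Phase one tags concepts in free text with character offsets, and phase two turns pairs of tagged concepts into assertions that could become edges in a knowledge graph.

```python
from dataclasses import dataclass

# Hypothetical toy lexicon; a real system would use a trained NER model.
LEXICON = {
    "aspirin": "Chemical",
    "headache": "Disease",
}


@dataclass(frozen=True)
class Mention:
    start: int   # character offset where the concept begins
    end: int     # character offset just past the concept
    text: str    # the surface form as it appears in the text
    label: str   # the concept type (illustrative label names)


@dataclass(frozen=True)
class Assertion:
    subject: str
    relation: str
    obj: str


def annotate(text):
    """Phase 1 (toy): find lexicon terms in free text, with offsets."""
    lowered = text.lower()
    mentions = []
    for term, label in LEXICON.items():
        pos = lowered.find(term)
        while pos != -1:
            end = pos + len(term)
            mentions.append(Mention(pos, end, text[pos:end], label))
            pos = lowered.find(term, pos + 1)
    return sorted(mentions, key=lambda m: m.start)


def assert_relations(mentions):
    """Phase 2 (toy): link each Chemical to each Disease found in the same text."""
    chemicals = [m for m in mentions if m.label == "Chemical"]
    diseases = [m for m in mentions if m.label == "Disease"]
    return [
        Assertion(c.text.lower(), "associated_with", d.text.lower())
        for c in chemicals
        for d in diseases
    ]


sentence = "Aspirin is commonly used to treat headache."
mentions = annotate(sentence)
edges = assert_relations(mentions)
```

Co-occurrence in one sentence is a very crude stand-in for a real relation extractor, but it shows the shape of the output: a list of (subject, relation, object) triples built on top of the phase one annotations.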

You can enter here: https://bitgrit.net/competition/13?utm_source=NCATS&utm_medium=organic&utm_campaign=litcoin 

The key dates are:

  • September 20, 2021: Challenge announcement
  • November 9, 2021: Competition launch
  • December 23, 2021: End of first challenge phase
  • December 27, 2021: Start of second challenge phase
  • February 28, 2022: End of second challenge phase
  • March 11, 2022: Final source code submission deadline
  • April 8, 2022: Winners announced

I'm a fan of this format and had the great pleasure of judging a similar program a few years ago in the social sciences:

https://ocean.sagepub.com/blog/2019/1/21/final-results-in-nyus-rich-context-competition-to-be-webcast-feb-15

In the end, participating teams started collaborating with each other, a good set of code was made open source, and our understanding of the problem domain was advanced.

How to motivate people to work on these problems will continue to be a challenge, but I think the competition format will continue to be a valuable tool for drawing attention to this kind of work.