MELINDA: A Multimodal Dataset for Biomedical Experiment Method Classification

Te-Lin Wu, Shikhar Singh, Sayan Paul, Gully Burns, and Nanyun Peng, in The Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21), 2021.

Download the full text


Abstract

We introduce a new dataset, MELINDA, for Multimodal Biomedical Experiment Method Classification. The dataset is collected in a fully automated, distant-supervision manner: the labels are obtained from an existing curated database, and the actual contents are extracted from the papers associated with each record in that database. We benchmark various state-of-the-art NLP and computer vision models, including unimodal models that take only caption text or images as input, as well as multimodal models. Our extensive experimental results show that multimodal models, despite outperforming the other benchmarked models, still require certain improvements, especially a less-supervised way of grounding visual concepts in language and better transfer learning for low-resource tasks. We release our dataset and the benchmarks to facilitate future research in multimodal learning, especially to motivate targeted improvements for applications in scientific domains.
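
As a toy illustration of the kind of unimodal (caption-text-only) baseline the paper benchmarks, the sketch below trains a TF-IDF bag-of-words classifier over figure captions. This is not the paper's method or the released data format; the file names and the "caption"/"label" column names are assumptions for illustration only.

    # Hypothetical caption-only baseline; file names and column names are
    # assumptions, not the actual MELINDA release format.
    import pandas as pd
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score

    train = pd.read_csv("melinda_train.csv")  # hypothetical split files
    test = pd.read_csv("melinda_test.csv")

    # Represent captions as TF-IDF features and fit a linear classifier
    # over the experiment-method labels.
    vectorizer = TfidfVectorizer(max_features=20000, ngram_range=(1, 2))
    X_train = vectorizer.fit_transform(train["caption"])
    X_test = vectorizer.transform(test["caption"])

    clf = LogisticRegression(max_iter=1000)
    clf.fit(X_train, train["label"])
    print("caption-only accuracy:", accuracy_score(test["label"], clf.predict(X_test)))

Stronger text-only baselines (e.g., fine-tuned pretrained language models) and the multimodal models in the paper would replace the TF-IDF features and linear classifier, but the evaluation loop stays the same.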


Bib Entry

@inproceedings{wu2021melinda,
  title = {MELINDA: A Multimodal Dataset for Biomedical Experiment Method Classification},
  author = {Wu, Te-Lin and Singh, Shikhar and Paul, Sayan and Burns, Gully and Peng, Nanyun},
  booktitle = {The Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21)},
  year = {2021}
}
