
Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems

Sarik Ghazarian, Ralph Weischedel, Aram Galstyan, and Nanyun Peng, in The Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20), 2020.

Bib Entry

@inproceedings{ghazarian2020predictive,
  title = {Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems},
  author = {Ghazarian, Sarik and Weischedel, Ralph and Galstyan, Aram and Peng, Nanyun},
  booktitle = {The Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20)},
  year = {2020}
}

Related Publications

  • Plot-guided Adversarial Example Construction for Evaluating Open-domain Story Generation

    Sarik Ghazarian, Zixi Liu, Akash S. M, Ralph Weischedel, Aram Galstyan, and Nanyun Peng, in The 2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2021.
    With the recent advances of open-domain story generation models, the lack of reliable automatic evaluation metrics becomes an increasingly imperative issue that hinders the development of such models. A critical bottleneck in obtaining a trustworthy learnable evaluation metric is the lack of high-quality training data for learning classifiers to efficiently distinguish between plausible and implausible machine-generated stories. Previous works relied on heuristically manipulating plausible examples to mimic possible system drawbacks such as repetition, contradiction, or irrelevant content at the text level, which can be unnatural and oversimplify the characteristics of implausible machine-generated stories. We propose to tackle these issues by generating a more comprehensive set of implausible stories using plots, which are structured representations of controllable factors used to generate stories. Since these plots are compact and structured, it is easier to manipulate them to generate text with targeted undesirable properties while maintaining the naturalness of the generation. To improve the quality of incoherent stories, we further apply the adversarial filtering procedure to select a more nuanced set of implausible texts. We find that the evaluation metrics trained on our generated data result in more reliable automatic assessments that correlate remarkably better with human judgments than other baselines.
    @inproceedings{ghazarian2021plot,
      title = {Plot-guided Adversarial Example Construction for Evaluating Open-domain Story Generation},
      author = {Ghazarian, Sarik and Liu, Zixi and M, Akash S and Weischedel, Ralph and Galstyan, Aram and Peng, Nanyun},
      booktitle = {The 2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL)},
      publisher = {Association for Computational Linguistics},
      pages = {4334--4344},
      year = {2021}
    }
    
  • Better Automatic Evaluation of Open-Domain Dialogue Systems with Contextualized Embeddings

    Sarik Ghazarian, Johnny Tian-Zheng Wei, Aram Galstyan, and Nanyun Peng, in 2019 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT 2019), NeuralGen Workshop, 2019.
    @inproceedings{ghazarian2019better,
      title = {Better Automatic Evaluation of Open-Domain Dialogue Systems with Contextualized Embeddings},
      author = {Ghazarian, Sarik and Wei, Johnny Tian-Zheng and Galstyan, Aram and Peng, Nanyun},
      booktitle = {2019 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT 2019), NeuralGen Workshop},
      year = {2019}
    }
    
  • Evaluating and Enhancing the Robustness of Retrieval-Based Dialogue Systems with Adversarial Examples

    Jia Li, Chongyang Tao, Nanyun Peng, Wei Wu, Dongyan Zhao, and Rui Yan, in CCF International Conference on Natural Language Processing and Chinese Computing, 2019.
    @inproceedings{li2019evaluating,
      title = {Evaluating and Enhancing the Robustness of Retrieval-Based Dialogue Systems with Adversarial Examples},
      author = {Li, Jia and Tao, Chongyang and Peng, Nanyun and Wu, Wei and Zhao, Dongyan and Yan, Rui},
      booktitle = {CCF International Conference on Natural Language Processing and Chinese Computing},
      pages = {142--154},
      year = {2019},
      organization = {Springer}
    }
    