Rexpression Generator for Dialogue Management System


Project Description

Natural Language Processing (NLP) is an active research area, and the formation of international associations such as the Association for Computational Linguistics (ACL) reflects the potential of research in this field. One of the problems faced by NLP system developers is that users' language is highly variable.

This project investigates natural language as textual input; spoken communication is beyond its scope. It examines common errors in human textual input, such as spelling errors and non-standard language usage, that a robust NLP system must address.

One aim of this project is to help non-computing individuals construct Perl 5 Regular Expressions (P5RE), which the Dialogue Management System (DM) uses for pattern recognition. In order to "understand" the input, the DM needs to process the string of characters. For example, a predefined question might be "I need help." If the user enters "I need helps", the DM might not understand this input because of the extra 's' after "help". Using the regular expression "help(s)?" allows the DM to accept both "help" and "helps" when it checks the input.
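The "help(s)?" example can be tried directly. The following is a minimal sketch in Python, whose `re` module accepts this Perl-style syntax. The surrounding pattern, the case-insensitive flag, the optional full stop, and the function name `understands` are assumptions made for illustration; they are not taken from the DM itself.

```python
import re

# Perl-style pattern from the text: "help" followed by an optional "s".
# Assumptions: matching ignores case and tolerates a trailing full stop.
PATTERN = re.compile(r"^I need help(s)?\.?$", re.IGNORECASE)

def understands(user_input: str) -> bool:
    """Return True if the input matches the predefined question."""
    return PATTERN.match(user_input.strip()) is not None

print(understands("I need help."))     # predefined question
print(understands("I need helps"))     # extra 's' is tolerated
print(understands("I need a hammer"))  # unrelated input is rejected
```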

Similarly, a user with no prior knowledge of regular expressions would simply enter a number of questions, and the expression generator would generate a compact Perl 5 Regular Expression that matches different strings with similar meanings. Using the same example of "I need help", the generated expression should also match "I need helps", "I needs help", "I need assistance", "Help me", and so on.
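As a baseline for comparison, the simplest possible generator just joins the user's example questions into one alternation. The sketch below (in Python, with the hypothetical names `naive_pattern` and `matches`) shows that idea. It is not the project's generator: the result is correct for the listed examples but neither compact nor able to generalise, which is the gap the project aims to close.

```python
import re

def naive_pattern(questions):
    """Join escaped example questions into a single alternation (no compaction)."""
    alternatives = [re.escape(q.strip()) for q in questions]
    return "(?:" + "|".join(alternatives) + ")"

examples = [
    "I need help",
    "I need helps",
    "I needs help",
    "I need assistance",
    "Help me",
]

# Assumption: matching is case-insensitive.
pattern = re.compile(naive_pattern(examples), re.IGNORECASE)

def matches(text: str) -> bool:
    """Return True only if the whole input equals one of the examples."""
    return pattern.fullmatch(text.strip()) is not None

print(matches("i need assistance"))
print(matches("I need money"))
```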

The research will be carried out in a number of stages:
  1. Explore or develop a synonym substituter
  2. Word Graph Optimisation of common questions
  3. Perl 5 Regular Expression generator
  4. Evaluation
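
One way the first three stages could fit together is sketched below in Python. It is a rough illustration under stated assumptions, not the method used in the project: the synonym table is a hand-written stand-in (a real system might consult WordNet), the "word graph" is simplified to a prefix tree over words, and all names (`SYNONYM_GROUPS`, `build_word_graph`, `graph_to_regex`, `generate`) are hypothetical. Input questions are assumed to have punctuation already removed.

```python
import re

# Stage 1 (assumed): a tiny hand-written synonym table.
SYNONYM_GROUPS = [["help", "assistance"]]
CANONICAL = {w: g[0] for g in SYNONYM_GROUPS for w in g}
EXPANSION = {g[0]: g for g in SYNONYM_GROUPS}
END = ""  # marks that a question may end at this node

def build_word_graph(questions):
    """Stage 2 (simplified): merge questions that share leading words."""
    graph = {}
    for question in questions:
        node = graph
        for word in question.lower().split():
            node = node.setdefault(CANONICAL.get(word, word), {})
        node[END] = {}
    return graph

def word_pattern(word):
    """Expand a canonical word into an alternation of its synonyms."""
    escaped = [re.escape(o) for o in EXPANSION.get(word, [word])]
    return escaped[0] if len(escaped) == 1 else "(?:" + "|".join(escaped) + ")"

def graph_to_regex(node, first=True):
    """Stage 3: emit one Perl-style expression for the whole graph."""
    branches = []
    for word in sorted(node):
        if word == END:
            branches.append("")  # empty alternative: the question may stop here
        else:
            sep = "" if first else r"\s+"
            branches.append(sep + word_pattern(word) + graph_to_regex(node[word], False))
    return branches[0] if len(branches) == 1 else "(?:" + "|".join(branches) + ")"

def generate(questions):
    """Compile a case-insensitive expression covering all questions."""
    return re.compile(graph_to_regex(build_word_graph(questions)), re.IGNORECASE)

rx = generate(["I need help", "I need assistance", "I want help", "Help me"])
print(rx.pattern)
print(bool(rx.fullmatch("I want assistance")))
```

Note that substituting synonyms before merging lets the expression accept "I want assistance" even though the user never entered it. The same mechanism can over-generalise (this sketch would also accept "assistance me"), which is one reason the evaluation stage matters.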

Research Plan

Task                          Time Usage
Background Reading            3-5 weeks (Semester One)
Design Methodology            2-3 weeks (Semester One)
Implementation and Testing    5-6 weeks (Semester One and Semester Two)
Evaluation                    3-4 weeks (Semester Two)
Write-up                      Rest of semester (Semester Two)

People

The following are the people that will be involved in this research.

Outcomes

The expected outcome is a DM system environment that enables a non-computing user to prepare questions and responses without any knowledge of P5RE. The system will optimise the user's questions to transparently create accurate and compact P5RE. The research will also evaluate the effectiveness of the system.


Resources

Dialogue Management

  1. Busemann, S., Declerck, T., Diagne, A. K., Dini, L., Klein, J. & Schmeier, S. 1997, 'Natural language dialogue service for appointment scheduling agents', Proceedings of the fifth conference on Applied natural language processing, pp. 25-32.
  2. Freedman, R. 2000, 'Plan-based dialogue management in a physics tutor', Proceedings of the sixth conference on Applied natural language processing, pp. 52-59.

Word Graph Representation / Optimisation

  1. Oerder, M. & Ney, H. 1993, 'Word graphs: an efficient interface between continuous-speech recognition and language understanding', IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 2, pp. 119-122.
  2. Quintana, Y., Kamel, M. & Lo, A. 1992, 'Graph-based retrieval of information in hypertext systems', Proceedings of the 10th annual international conference on Systems documentation, pp. 157-168.
  3. Seo, J. & Simmons, R. F. 1989, 'Syntactic graphs: a representation for the union of all ambiguous parse trees', Computational Linguistics, vol. 15, no. 1, pp. 19-32.

Synonym / Word Sense

  1. Miller, G. A. 1995, 'WordNet: A lexical database for English', Communications of the ACM, vol. 38, no. 11.

Publications

  1. Tan, S. H. 2005, 'Rexpression: Improving Matching for Script Based Dialogue Managers', Honours thesis, Department of Computing, Curtin University of Technology, Perth, Western Australia, Australia. Latex tar file

Dictionary

  1. SMS Dictionary Version 1 (7.4 MB)
  2. SMS Dictionary Version 1 Compressed (2.0 MB)

Links

  1. Mentor System
  2. Regular Expression Checker by Xerox
  3. English to SMS Lingo Translater