Background: There is a critical need to automatically extract and synthesize knowledge and trends in nanotechnology research from an exponentially growing body of literature. New engineered nanomaterials (ENMs), such as nanomedicines, are continuously being discovered, and Natural Language Processing (NLP) approaches can semi-automate the cataloging of ENMs and their unique physicochemical properties. Although such efforts lag behind the discovery of biomedical relationships, NLP techniques can also link the proposed applications of ENMs to their physicochemical properties. The potential for unintended consequences from the commercialization of any emerging technology, including nanotechnology, underscores the need for risk assessment to keep pace with ENM discovery and application. NLP approaches can be used to automatically aggregate studies on the exposure and hazard of ENMs and to link physicochemical properties to the measured effects.

Goals: The team is looking for sophomore, junior, and senior undergraduate students in Computer Science, Chemical and Life Science Engineering, and Biomedical Engineering who have a strong desire to participate in the creation of machine learning technologies. Project work includes:


  • Developing links to appropriate literature repositories
  • Developing techniques for author identification, document summarization, and document classification
  • Developing a quality control framework and user interface
  • Conducting usability analysis with users


Contact: Prof. Bridget McInnes or Prof. Nastassja Lewinski