Advanced Machine Learning Models for Culturally-Agnostic Sentiment Analysis

Organization
Office of the Director of National Intelligence (ODNI)
Reference Code
ICPD-2020-10
How to Apply

Create and release your Profile on Zintellect – Postdoctoral applicants must create an account and complete a profile in the online application system. Please note: your resume/CV may not exceed 2 pages.

Complete your application – Enter the rest of the information required for the IC Postdoc Program Research Opportunity. The application itself contains detailed instructions for each one of these components: availability, citizenship, transcripts, dissertation abstract, publication and presentation plan, and information about your Research Advisor co-applicant.

Additional information about the IC Postdoctoral Research Fellowship Program is available on the program website located at: https://orise.orau.gov/icpostdoc/index.html.

If you have questions, send an email to ICPostdoc@orau.org.  Please include the reference code for this opportunity in your email. 

Application Deadline
2/28/2020 6:00:00 PM Eastern Time Zone
Description

Research Topic Description, including Problem Statement:

Sentiment analysis is a field of Natural Language Processing (NLP) that deals with automated methods for quantitatively measuring affect in unstructured textual data, usually surrounding a particular subject. Current methods of sentiment analysis assign numerical values to words and phrases based on a defined lexicon or a supervised learning algorithm, and report metrics as an average over the queried string. Many currently available models measure both the subjectivity (subjective versus objective statement) and the polarity (positive versus negative opinion) of a word or phrase.

While modern methods of sentiment analysis are useful within the private sector for brand management and advertising, they have several limitations. Lexical disambiguation, for example, is difficult to quantify because the true sentiment of a phrase depends on contextual clues. The word “kill” can have a different polarity depending on context. Consider the sentences “Your customer service is killing me” and “You’re killing it!” In the first sentence, “kill” has an obvious negative polarity, while in the second sentence, its polarity is clearly positive. Developing comprehensive analytical models that account for lexical ambiguity remains a challenge.

Culture presents another challenge to current methods of sentiment analysis. Communication styles (particularly when sharing opinions on the Internet) vary greatly across cultures, demographics, and the selected communication medium, yet sentiment analysis tools are largely modeled after Western communication styles and behavior and focus on social media. This limits the broad utility of currently available techniques. Additionally, subjectivity and polarity may not be as relevant to non-Western cultural communication styles, so other metrics should be evaluated. By developing more comprehensive machine learning algorithms, one can expand the applicability of sentiment analysis.
Identifying parameters beyond polarity and subjectivity that are applicable to both Western and non-Western communication styles would enable the development of more robust sentiment analyzers with broader applicability.
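
The lexicon-based approach and its lexical-ambiguity limitation described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the lexicon entries, their numerical values, and the function name score are hypothetical and are not taken from any published lexicon or existing tool. The sketch assigns a fixed polarity and subjectivity to each known word and averages them over the queried string, so both “kill” sentences receive the same negative score.

```python
import re

# Hypothetical toy lexicon: word -> (polarity in [-1, 1], subjectivity in [0, 1]).
# The values are invented for illustration only.
TOY_LEXICON = {
    "killing": (-0.8, 0.9),
    "great": (0.8, 0.75),
    "terrible": (-1.0, 1.0),
}


def score(text, lexicon=TOY_LEXICON):
    """Average polarity and subjectivity over the lexicon words found in text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    hits = [lexicon[t] for t in tokens if t in lexicon]
    if not hits:
        # No known words: treat the string as neutral and objective.
        return 0.0, 0.0
    polarity = sum(p for p, _ in hits) / len(hits)
    subjectivity = sum(s for _, s in hits) / len(hits)
    return polarity, subjectivity


# A context-free lexicon cannot tell these two uses of "killing" apart.
negative = score("Your customer service is killing me")
positive = score("You're killing it!")
print(negative, positive)
```

Because the score depends only on which lexicon words appear, the second sentence is rated as negatively as the first, which is the kind of error a context-aware model would need to correct.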

Example Approaches:

This research can be approached in different ways. The topic broadly encompasses the fields of data science, mathematics, sociology, and linguistics; thus collaborative and versatile approaches are encouraged. Researchers could focus on developing a more robust classification algorithm for sentiment analysis that accounts for cultural differences in online communication and contextual clues in addition to lexical polarity.

Proposals could consider one or more of the following:

  • How does culture influence online communication practices, and by extension, the applicability of currently-available sentiment analyzers?
  • What parameters, other than polarity and subjectivity, strongly correlate with a measure of sentiment or emotional affect? How sensitive are these correlations to cultural differences and/or differences in communications media? Are there metrics that are insensitive to these differences?
  • Can robust sentiment classification models be trained and/or developed that are applicable to non-Western cultures and communication media?

Relevance to the Intelligence Community:

Robust sentiment analyzers for textual data acquired from internet sources, such as news and social media, can be used to gain insight into the hearts and minds of any group of people. This could include how the group perceives a government policy, or their perception of U.S. operations and policies. A comprehensive, culturally agnostic method for analyzing sentiment surrounding U.S. policy in target populations is imperative for policy makers to make informed decisions. Thus, research in this area will enable significant secondary effects in future intelligence operations.

Key Words: Natural Language Processing, Sentiment Analysis, Cultural Differences, Social Media, Machine Learning, Model, NLP, Affect, Emotion, Sentiment, Text Classification, Culturally Agnostic

Qualifications

Postdoc Eligibility

  • U.S. citizens only
  • Ph.D. in a relevant field must be completed before beginning the appointment and within five years of the application deadline
  • Proposal must be associated with an accredited U.S. university, college, or U.S. government laboratory
  • Eligible candidates may only receive one award from the IC Postdoctoral Research Fellowship Program

Research Advisor Eligibility

  • Must be an employee of an accredited U.S. university, college or U.S. government laboratory
  • Is not required to be a U.S. citizen
Eligibility Requirements
  • Citizenship: U.S. Citizen Only
  • Degree: Doctoral Degree.
  • Discipline(s):
    • Communications and Graphics Design (2)
    • Computer, Information, and Data Sciences (16)
    • Earth and Geosciences (20)
    • Engineering (27)
    • Environmental and Marine Sciences (15)
    • Life Health and Medical Sciences (46)
    • Mathematics and Statistics (11)
    • Nanotechnology (1)
    • Other Non-S&E (2)
    • Other Physical Sciences (12)
    • Physics (16)
    • Social and Behavioral Sciences (27)