The URSA Project

Problem

Web sites commonly allow users to post reviews on a wide variety of topics. Online forums, blogging, and the wiki revolution have enabled people to share their knowledge and experience with others. This information is an important asset for users deciding whether to buy a product, see a movie, or go to a restaurant, as well as for businesses tracking user feedback. However, most reviews are written as free text, usually with very little structured metadata, and are therefore difficult for computers to understand, analyze, and aggregate. User experience would be greatly improved if the structure of the review content were taken into account, i.e., if the parts of a review pertaining to different features of a product (e.g., food, atmosphere, price, and service for a restaurant) were identified, along with the sentiment of the reviewer towards each feature (e.g., positive, negative, or neutral). This information, coupled with the metadata associated with a product (e.g., location and type of cuisine for restaurants), can then be used to analyze and access reviews.
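To make the idea of feature-level structure concrete, the following is a minimal sketch of how review sentences annotated with a feature and a sentiment could be aggregated. It is only an illustration, not the project's implementation: the names LabeledSentence and summarize are hypothetical, and the labels are assigned by hand here rather than produced by a classifier.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LabeledSentence:
    """One review sentence with hand-assigned labels (hypothetical type)."""
    text: str
    feature: str    # e.g. "food", "atmosphere", "price", "service"
    sentiment: str  # "positive", "negative", or "neutral"


def summarize(sentences):
    """Count how often each sentiment occurs for each feature."""
    counts = defaultdict(Counter)
    for s in sentences:
        counts[s.feature][s.sentiment] += 1
    # Convert to plain dicts so the result is easy to print or compare.
    return {feature: dict(c) for feature, c in counts.items()}


# Illustrative, made-up sentences; not taken from any real data set.
review = [
    LabeledSentence("The pasta was excellent.", "food", "positive"),
    LabeledSentence("Our waiter forgot the drinks.", "service", "negative"),
    LabeledSentence("Dessert was great too.", "food", "positive"),
]

print(summarize(review))
```

A per-feature summary of this kind could then be combined with product metadata such as location or cuisine when searching or ranking reviews.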

Goal

The goal of the URSA project is to provide a better understanding of user reviewing patterns and to develop tools to better search, understand and access user reviews. We performed an in-depth classification and analysis of a real-world restaurant review data set mined from Citysearch, New York.

As a first step, we are developing techniques to classify and analyze text- and structure-based web reviews, and to use the resulting analysis to improve personalized recommendations for web users.

Research Challenges

  • Structure Identification and Analysis
  • Text and Structure Search
  • Similarity Search in Social Networks