Proceedings of the Second Workshop on Analytics for Noisy Unstructured Text Data, AND 2008, Singapore, July 24, 2008
Daniel P. Lopresti, Shourya Roy, Klaus U. Schulz, L. Venkata Subramaniam (Editors)
- Anthology ID: 2008.sigirconf_workshop-2008and
- Year: 2008
- Venue: sigirconf_workshop
- Publisher: ACM
- DBLP: conf/sigir/2008and
How to cope with questions typed by dyslexic users
Laurianne Sitbon | Patrice Bellot
Optical character recognition errors and their effects on natural language processing
Daniel P. Lopresti
Successfully detecting and correcting false friends using channel profiles
Ulrich Reffle | Annette Gotscharek | Christoph Ringlstetter | Klaus U. Schulz
Named entity normalization in user generated content
Valentin Jijkoun | Mahboob Alam Khalid | Maarten Marx | Maarten de Rijke
Rule based synonyms for entity extraction from noisy text
Rema Ananthanarayanan | Vijil Chenthamarakshan | Prasad M. Deshpande | Raghuram Krishnapuram
Blogger, stick to your story: modeling topical noise in blogs with coherence measures
Jiyin He | Wouter Weerkamp | Martha A. Larson | Maarten de Rijke
Uncovering deep user context from blogs
Robert McArthur
On profiling blogs with representative entries
Jinfeng Zhuang | Steven C. H. Hoi | Aixin Sun
A comparative study of statistical features of language in blogs-vs-splogs
Soumya Datta | Sudeshna Sarkar
Unsupervised learning of multilingual short message service (SMS) dialect from noisy examples
Sreangsu Acharyya | Sumit Negi | L. Venkata Subramaniam | Shourya Roy
Data driven methods for improving mono- and cross-lingual IR performance in noisy environments
Antti Järvelin | Tuomas Talvensaari | Anni Järvelin
Opinion mining from noisy text data
Lipika Dey | S. K. Mirajul Haque
Latent Dirichlet allocation based multi-document summarization
Rachit Arora | Balaraman Ravindran
An unsupervised Hindi stemmer with heuristic improvements
Amaresh Kumar Pandey | Tanveer J. Siddiqui
Topic based language models for OCR correction
Anurag Bhardwaj | Faisal Farooq | Huaigu Cao | Venu Govindaraju
A novel Arabic lemmatization algorithm
Eiman Tamah Al-Shammari | Jessica Lin
Noise and information
John Tait
Some thoughts on failure analysis for noisy data
Donna Harman