EasyESA: A Low-effort Infrastructure for Explicit Semantic Analysis (Demo)

Carvalho, Danilo; Calli, Cagatay; Freitas, André; Curry, Edward

Abstract:

Distributional semantic models (DSMs) are semantic models which are based on the statistical analysis of co-occurrences of words in large corpora. DSMs can be used in a wide spectrum of semantic applications including semantic search, question answering, paraphrase detection, word sense disambiguation, among others. The ability to automatically harvest meaning from unstructured heterogeneous data, its simplicity of use and the ability to build comprehensive semantic models are major strengths of distributional approaches. The construction of distributional models, however, is dependent on processing large-scale corpora. The English version of Wikipedia 2014, for example, contains 44 GB of article data. The hardware and software infrastructure requirements necessary to process large-scale corpora bring high entry barriers for researchers and developers to start experimenting with distributional semantics. In order to reduce these barriers we developed EasyESA, a high-performance and easy-to-deploy distributional semantics framework and service which provides an Explicit Semantic Analysis (ESA) [4] infrastructure.

Reference:

Danilo Carvalho, Cagatay Calli, André Freitas, Edward Curry, "EasyESA: A Low-effort Infrastructure for Explicit Semantic Analysis (Demo)", In 13th International Semantic Web Conference (ISWC 2014), Springer, Riva del Garda, pp. 177-180, 2014.

Bibtex Entry:

@inproceedings{Carvalho2014,
abstract = {Distributional semantic models (DSMs) are semantic models which are based on the statistical analysis of co-occurrences of words in large corpora. DSMs can be used in a wide spectrum of semantic applications including semantic search, question answering, paraphrase detection, word sense disambiguation, among others. The ability to automatically harvest meaning from unstructured heterogeneous data, its simplicity of use and the ability to build comprehensive semantic models are major strengths of distributional approaches. The construction of distributional models, however, is dependent on processing large-scale corpora. The English version of Wikipedia 2014, for example, contains 44 GB of article data. The hardware and software infrastructure requirements necessary to process large-scale corpora bring high entry barriers for researchers and developers to start experimenting with distributional semantics. In order to reduce these barriers we developed EasyESA, a high-performance and easy-to-deploy distributional semantics framework and service which provides an Explicit Semantic Analysis (ESA) [4] infrastructure.},
address = {Riva del Garda},
author = {Carvalho, Danilo and Calli, Cagatay and Freitas, Andr{\'{e}} and Curry, Edward},
booktitle = {13th International Semantic Web Conference (ISWC 2014)},
pages = {177--180},
publisher = {Springer},
title = {{EasyESA: A Low-effort Infrastructure for Explicit Semantic Analysis (Demo)}},
url = {http://www.edwardcurry.org/publications/easyesa_demo_2014.pdf},
year = {2014}
}