Links 1 through 10 of 46 by Juan Manuel Caicedo tagged textmining

Mavuno is an open source, modular, scalable text mining toolkit built upon Hadoop. It supports basic natural language processing tasks (e.g., part of speech tagging, chunking, parsing, named entity recognition), is capable of large-scale distributional similarity computations (e.g., synonym, paraphrase, and lexical variant mining), and has information extraction capabilities (e.g., instance and semantic relation mining). It can easily be adapted to new input formats and text mining tasks.

The Content Analysis Web Service detects entities and concepts, categories, and relationships within unstructured content. It ranks the detected entities and concepts by overall relevance, resolves them to Wikipedia pages where possible, and annotates tags with relevant metadata.

SimString is a simple library for fast approximate string retrieval. Approximate string retrieval finds the strings in a database whose similarity to a query string is at least a given threshold. Because it finds similar strings as well as identical ones, it has various applications, including spelling correction, flexible dictionary matching, duplicate detection, and record linkage.
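To make the threshold idea concrete, here is a minimal sketch in Python. It is not SimString's API or its indexing algorithm: it is a brute-force linear scan, and the function names (char_ngrams, jaccard, approximate_retrieve), the choice of Jaccard similarity over character trigrams, and the default threshold are all assumptions made for illustration.

```python
def char_ngrams(text, n=3):
    """Return the set of character n-grams of text, padded with spaces at both ends."""
    padded = " " * (n - 1) + text + " " * (n - 1)
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def approximate_retrieve(query, database, threshold=0.5, n=3):
    """Return (string, similarity) pairs with similarity >= threshold, best first.

    Hypothetical helper: scans every string, so it is O(len(database)) per query.
    """
    q = char_ngrams(query, n)
    hits = []
    for s in database:
        sim = jaccard(q, char_ngrams(s, n))
        if sim >= threshold:
            hits.append((s, sim))
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    words = ["retrieval", "retrieve", "retriever", "revival", "dictionary"]
    for word, score in approximate_retrieve("retreival", words, threshold=0.3):
        print(f"{word}\t{score:.2f}")
```

Raising the threshold narrows the results toward exact matches, and lowering it admits looser variants. A dedicated library would typically avoid the full scan by indexing the n-grams.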

This tutorial covers all aspects of building effective sentiment analysis systems for textual data, with and without sentiment-relevant metadata like star ratings. Created by Christopher Potts (Stanford Linguistics).

This is a Python wrapper for crfsuite, a fast implementation of Conditional Random Fields.

Command-line tools for training NLTK classifiers.

MALLET is a Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications to text.

An overview of the algorithms used for detecting duplicates in a collection of documents.
