Links 1 through 10 of 93 by Peter Skomoroch tagged collaborative+filtering

Some people have asked for a dump of some voting data, so I made one. You can download it via BitTorrent (it's hosted and seeded by S3, so don't worry about it going away) and have at it. The format is:

username,link_id,vote

where vote is -1 or 1 (downvote or upvote).

The dump is 29MB gzip compressed and contains 7,405,561 votes from 31,927 users over 2,046,401 links. It contains votes only from users with the preference "make my votes public" turned on (which is not the default).
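
As a rough sketch of how such a dump could be read, the Python below tallies `username,link_id,vote` rows into the kind of totals quoted above. Several things here are assumptions rather than facts from the post: the function names `summarize_votes` and `summarize_dump`, the idea that the file has no header row, and the sample usernames and link IDs, which are made up for illustration.

```python
import csv
import gzip
from collections import Counter


def summarize_votes(rows):
    """Tally (username, link_id, vote) rows into summary counts."""
    users, links = set(), set()
    votes = Counter()
    for row in rows:
        if not row:  # skip blank lines
            continue
        username, link_id, vote = row
        vote = int(vote)
        if vote not in (-1, 1):
            raise ValueError(f"unexpected vote value: {vote}")
        users.add(username)
        links.add(link_id)
        votes[vote] += 1
    return {
        "votes": votes[1] + votes[-1],
        "users": len(users),
        "links": len(links),
        "upvotes": votes[1],
        "downvotes": votes[-1],
    }


def summarize_dump(path):
    """Read a gzip-compressed CSV dump (path is hypothetical, no header assumed)."""
    with gzip.open(path, "rt", newline="") as handle:
        return summarize_votes(csv.reader(handle))


# Made-up sample rows in the documented format.
sample = ["alice,t3_a,1", "bob,t3_a,-1", "alice,t3_b,1"]
summary = summarize_votes(csv.reader(sample))
print(summary)
```

Streaming the rows keeps memory use proportional to the number of distinct users and links rather than to the number of votes, which seems a reasonable fit for a file with several million rows.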

This doesn't have the subreddit ID or anything in there, but I'd be willing to make another dump with more data if anything comes of this one.

The sample data consists of 440,237 entries, each identifying a single user who has watched a single repository. This data contains:

* 120,867 unique repositories
* 56,555 unique users

Download the dataset (.zip)
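
The on-disk layout of the entries is not described here, so the sketch below starts from already-parsed `(user, repository)` pairs and shows one way the unique counts above could be derived. The function name `build_watch_index` and the integer IDs in the sample are hypothetical.

```python
from collections import defaultdict


def build_watch_index(pairs):
    """Group (user, repo) watch entries by user and by repository."""
    watched_by_user = defaultdict(set)
    watchers_by_repo = defaultdict(set)
    for user, repo in pairs:
        watched_by_user[user].add(repo)
        watchers_by_repo[repo].add(user)
    return watched_by_user, watchers_by_repo


# Made-up entries: each pair is one user watching one repository.
pairs = [(1, 10), (1, 11), (2, 10), (3, 12)]
by_user, by_repo = build_watch_index(pairs)

print(len(by_user), "unique users")
print(len(by_repo), "unique repositories")
```

Keeping both directions of the mapping is a convenience: the per-repository sets are what an item-based neighbourhood method would compare, while the per-user sets give each user's watch history.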

Mayur Datar, Research Scientist at Google, Inc., describes the MapReduce algorithms used for Google News Personalization.
