PErsonal social Tagging and Searchlog (PETS) - For Personalized Search with Improvement by Social Tagging Experiment


This dataset was created by Shihn-Yuarn Chen at National Chiao Tung University (NCTU), Taiwan. As of May 2009 it is still in the data collection phase, and it may be of interest to people doing related research.

This dataset covers the search logs of 14 subjects, collected from Google Web History. The subjects do not share a specific domain background. The Google Web History records the queries submitted by each subject, the URLs clicked for each query, and the time of each query and click.

We also collect the social bookmark information of these subjects from and (a social bookmark service provider in Taiwan). The social bookmark data covers the bookmarks of each subject and the tags of each bookmark (annotated by the subject and by other users).

Data collection started in Apr. 2009 and ended in Jul. 2009.

Statistics of Dataset

The search logs collected from Google Web History contain 10191 queries (181 queries/subject-month) and 11603 clicks (1.139 clicks/query).

We also resubmitted each query and collected its result list (at most the top 100 results), which yielded 348843 URLs (278248 distinct).
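
The ratios implied by the counts above can be recomputed with a short script. This is only a sketch: the constant and function names are made up for illustration, and the derived quantities other than clicks/query (queries per subject, result URLs per query, fraction of distinct URLs) are not reported by the dataset itself; they simply follow from dividing the reported counts.

```python
# Counts as reported in the dataset statistics.
SUBJECTS = 14
QUERIES = 10191
CLICKS = 11603
RESULT_URLS = 348843
DISTINCT_RESULT_URLS = 278248


def summarize():
    """Return simple ratios derived from the reported counts."""
    return {
        # Matches the reported clicks/query figure when rounded to 3 places.
        "clicks_per_query": round(CLICKS / QUERIES, 3),
        # Over the whole collection period, not per month.
        "queries_per_subject": round(QUERIES / SUBJECTS, 1),
        # Average size of the collected result lists (capped at 100 each).
        "result_urls_per_query": round(RESULT_URLS / QUERIES, 1),
        # Share of collected result URLs that are distinct.
        "distinct_url_fraction": round(DISTINCT_RESULT_URLS / RESULT_URLS, 3),
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value}")
```

The average result list is well below the 100-result cap, which suggests that many queries returned fewer than 100 results or that some result lists could not be collected in full.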

Relevance Judgement

To Do List

  1. Collect a larger dataset, PETS-2

Contact: sychen{AT}cs_nctu_edu_tw