Social tagging
Traugott Koch, UKOLN
Presentation at UKOLN 2006-02-06
Contents:
1 Introduction
2 Definitions and overviews
3 Usefulness in real systems
4 HE sector opportunities, R&D recommendations
References
1 Introduction
- Context of this discussion
- Hype
- Web 2.0, Library 2.0, Business 2.0
- speak of revolution, new era etc. (Kroski)
- Not new
- author keywords (Index Medicus etc.); user created structures (DMOZ, Yahoo directories)
- systems that invite user corrections (CiteSeer)
- author and end user metadata (thought to be unrealistic). Now: immediate personal reward
- how successful have they been?
- Risks
- commercial interests (Yahoo bought flickr, delicious etc.): improved user and group profiles for better targeting
- big brother (US Gov subpoenas; censorship), privacy
- will not take off, frustration of contributors
- other knowledge organization activities being abandoned
2 Definitions and overviews
- 2.1 Terminological issues
- 16 different meanings to the term tag(ging) acc. to wikipedia
- folksonomies; community terminology; tagsonomy
- fluid, culturally expressed ontologies
- is it always collaborative, social? Maybe participatory. Individual vs. social tagging
- tagging, keyword indexing, categorization, classification, faceted classification, Dublin Core as classification system (sic!)
- serendipity vs. systematic browsing
- broad vs. narrow folksonomies (Vander Wal)
- bottom-up vs. top-down, and the connection to analytico-synthetic (faceted) vs. hierarchical-enumerative schemes
Focus here on social tagging for topical discovery
- 2.2 Approaches exploring "social" and community activities
- Linking (Search engines ranking)
- Citation (CiteSeer, Google Scholar)
- Annotation
- Recommendation, Recommender systems (e.g. Bookmark sharing, unalog; reviews)
- Wish lists (Amazon); Reading lists; Shopping lists
- Usage:
- Popularity:
- of usage (Search engines ranking);
- of bookmarks (delicious, CiteULike, Digg.com, FeedButler, ...)
- User behaviour, preferences, aggregate choices (Amazon; ConnectViaBooks)
- Social tagging, collaborative web tagging, user contributed metadata
- Collaborative filtering
- Social searching
- Customization, Personalization
- 2.3 Categorizations of social tagging systems
by content creator and tag users: self - others [Hammond et al]
by audience: scholarly - general [Hammond et al]
by object type
- Web pages/blogs, bookmarks (delicious, connotea, CiteULike, Technorati)
- pictures (flickr)
- music (Last.fm, Listal)
- products (Amazon product tagging; Yahoo Shoposphere)
- news (Digg, News Meme, NewsVine)
- goals (43 Things)
- friends (FOAF ?)
- advertising (Adzooks, theadcloud)
3 Social tagging systems in present practice
- Disadvantages/problems:
"Tagging bulldozes the cost of classification and piles it onto the price of discovery" (Ian Davis)
"The old way creates a tree. The new rakes leaves together" (David Weinberger)
Major:
- Undecided purpose (classification/indexing/discovery; social networking)
- Same approach for all object types (text, web pages, link lists, blogs, pictures/other media, multimedia etc.)
- No controlled vocabulary; no authority forms for names (first names, last names, nicknames) or for terms (plural-singular rules)
- No rules: exhaustivity, specificity etc.
- No structure
- No context
- Not suitable for targeted effective search or systematic topical browsing
Detailed:
- Wordform (singular, plural), other morphological inconsistencies (nouns), spelling, word case, use of numbers
- Different character sets and transliteration
- Lack of synonym control
- Lack of homonym control
- Bad retrieval performance: low precision (one word terms); low recall
- No phrases
- Compound construction (special character usage etc.)
- Both pre- and post-coordinated approach
- Different languages
- Uncontrolled acronyms, name forms (incl. Geographic)
- Place names: as classification or just as meaninglessly associated place
- Times, dates
- Personal context, tags not understandable by others (no interpersonal/intergroup meaning), e.g. Me, mine, (my) brother, fun
- Context missing
- Highly differing granularity of tags (both classification and keyword indexing)
- No structure, hierarchy or relationships
- Hierarchy/structure encoded in tags
- Other information encoded in tags (longitude-latitude)
- Other metadata put into tags (places, times, names, types etc. vs. topics)
- Different document types
- Highly vulnerable to abuse such as manipulation, corruption, spamming etc.
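Several of the detailed problems above (word form, word case, compound construction, character sets) can be illustrated with a small normalization sketch. This is a minimal illustration, not a method taken from any of the systems discussed: the function names `normalize_tag` and `group_variants`, the sample tags and the crude plural rule are all assumptions made for the example.

```python
import re
import unicodedata
from collections import defaultdict


def normalize_tag(tag: str) -> str:
    """Reduce a free tag to a rough comparison key (illustrative only)."""
    # Decompose accented characters and drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", tag)
    plain = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Fold case and remove separators commonly used to build compounds.
    key = re.sub(r"[\s\-_.+]+", "", plain.lower())
    # Naive plural rule: an assumption for illustration, wrong for many
    # words (e.g. "news" becomes "new").
    if len(key) > 3 and key.endswith("s") and not key.endswith("ss"):
        key = key[:-1]
    return key


def group_variants(tags):
    """Group the original tag spellings under their comparison key."""
    groups = defaultdict(set)
    for tag in tags:
        groups[normalize_tag(tag)].add(tag)
    return dict(groups)


# Hypothetical sample tags showing typical inconsistencies.
sample = ["tag", "Tags", "TAGS", "web-design", "web_design", "WebDesign", "café", "Cafe"]
for key, variants in group_variants(sample).items():
    print(key, sorted(variants))
```

Such surface rules only address spelling variants; synonyms, homonyms, personal context and differing granularity remain untouched, which is why they appear as separate problems in the list.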
- Advantages/opportunities:
A folksonomy is "liberating, not restrictive; bottom-up, not imposed; relational, not hierarchical. It also cleverly harnesses selfish acts and directs them towards the common good. But most of all, it just seems to fit the way our brains work" (T. Hannay)
- Represents co-active intelligence (Ashman CS Nottingham)
- Derivation of meaning from the consensus arising from mass human interaction
- Bypasses the need for the explicit representation and automatic calculation of language-use rules (Ashman)
- Current, copes with fast changes
- Directly reflects users' information needs, user-centered and not expert-dictated
- Language and cultural richness, reflects user language, has no information loss
- Inclusive to minorities and niche interests, without cultural, social or political bias
- Democratic
- Self-moderating via social dynamics
- Offers discovery rather than finding
- Supports serendipity
- Supports learning
- Offers insight into user behaviour
- Forced move, unavoidable ("amateurization of cataloguing", Shirky)
- Best trade-off (between low cost/simplicity and retrieval performance)
- Cheap
- Better than nothing
- "Low-investment bridge between personal and shared classification" (McMullin)
- Most shortcomings are in fact design features
- Can create and support communities
4 HE sector opportunities, R&D recommendations
Social tagging services as we know them today are not performing well when it comes to efficient searching, or systematic browsing and discovery
They should not replace other indexing and KO efforts
4.1 Greatest benefits to expect:
- materials and publications which are largely ignored by other services
- smaller cooperating groups, specialised subjects, communication intensive work environments, fields where there are no vocabulary systems
- creating and/or improving vocabularies
- mixed media and multimedia indexing (incl. Learning objects)
- input to established systems/services as additional data
- combination with KO systems and subject access, different layer of indexing
- stimulate development and research efforts
4.2 Research:
- user behaviour related to tagging and navigation (tagging practice, influence of social environment and related/popular tags display)
- retrieval performance
- social benefit
- benefit of tagging standards
- scalability and architecture
- mass effects/intelligence
- structure of tagspace
- convergence of terminology (HP Labs: after 100 taggers describe the same website)
- integration of heterogeneous tagsets, multiple "ontologies"
- how many of the tags are already covered in established systems and what is genuinely new terminology; compare with alternative terminology creation approaches
- compare with author and professional indexing regarding discovery improvements; insights
- earlier research
- study new developments below
- tools and user interfaces for tagging (steve.museum research agenda; guided tagging, influence of UI, architectures, data analysis, integration into museum systems etc.)
4.3 Development: experiments:
- 4.3.1 Improve existing social tagging systems:
- define the purposes (build classification on top of improved indexing)
- user education (?)
- system support during tagging process:
- keyword extraction and proposal (Google Suggest)
- dictionary lookup
- visualization
- tag improvements by the system after initial tagging
- misspelling
- language specification
- compound treatment
- synonym linking
- WordNet, Wikipedia disambiguation
- create flat island/partial hierarchies or semantic networks
- create facets (Siderean's faceted search of delicious tags) [fac.etio.us]
- manage the tag set
- search and browse improvements: tag clusters beyond flickr clusters; co-occurrence; other aggregations; filters; ranking; visualization (Chudnov/unalog using Starlight; flythrough navigation in Lust Digital Depot)
- user interface improvements
- do more with the tagging data
- combination with other IR features: clustering of tags; map tags to other vocabularies; create concept maps
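Co-occurrence, mentioned above as a basis for tag clusters and other aggregations, can be sketched as simple pair counting. This is a minimal sketch under stated assumptions: the function names `cooccurrence` and `related_tags` and the sample bookmarks are hypothetical, and real services may weight or cluster tags quite differently.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(tagged_items):
    """Count how often each unordered pair of tags is applied to the same item."""
    pairs = Counter()
    for tags in tagged_items:
        # Sort the distinct tags so that each pair has one canonical order.
        for a, b in combinations(sorted(set(tags)), 2):
            pairs[(a, b)] += 1
    return pairs


def related_tags(pairs, tag, top=3):
    """Return the tags that most often co-occur with the given tag."""
    scores = Counter()
    for (a, b), count in pairs.items():
        if a == tag:
            scores[b] += count
        elif b == tag:
            scores[a] += count
    return scores.most_common(top)


# Hypothetical bookmarks, each represented by its list of tags.
bookmarks = [
    ["folksonomy", "tagging", "metadata"],
    ["tagging", "metadata", "library"],
    ["folksonomy", "tagging"],
    ["photos", "london"],
]
pairs = cooccurrence(bookmarks)
print(related_tags(pairs, "tagging"))
```

Raw counts favour popular tags; a normalized measure would be a natural next step, but that choice is left open here.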
- 4.3.2 Build alternative tagging systems:
- optimise for discovery and retrieval
- try tagging in more homogeneous services
- use of controlled vocabularies in tagging services
- offer both free tags, improved tags and controlled vocabulary for navigation
- map tags to facets, controlled vocabularies and authorities
- hook library and discovery services into social tagging systems (Lorcan)
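Mapping free tags to a controlled vocabulary, while keeping unmatched tags available as free tags, could look roughly like the following. The vocabulary, its entry terms and the function names `build_lookup` and `map_tags` are invented for illustration; a real service would draw on an existing KOS and on far better matching.

```python
# Hypothetical mini vocabulary: preferred term -> entry terms (variants, synonyms).
VOCABULARY = {
    "Folksonomies": ["folksonomy", "folksonomies", "tagsonomy"],
    "Subject indexing": ["indexing", "keyword indexing", "subject indexing"],
    "Classification": ["classification", "categorization", "categorisation"],
}


def build_lookup(vocabulary):
    """Invert the vocabulary into a case-folded entry term -> preferred term table."""
    lookup = {}
    for preferred, entry_terms in vocabulary.items():
        for term in entry_terms:
            lookup[term.casefold()] = preferred
    return lookup


def map_tags(tags, lookup):
    """Split tags into matched preferred terms and remaining free tags."""
    controlled, free = set(), []
    for tag in tags:
        # Treat "_" and "-" as word separators and collapse whitespace.
        key = " ".join(tag.casefold().replace("_", " ").replace("-", " ").split())
        if key in lookup:
            controlled.add(lookup[key])
        else:
            free.append(tag)
    return sorted(controlled), free


lookup = build_lookup(VOCABULARY)
controlled, free = map_tags(["Folksonomy", "keyword_indexing", "toread", "tagsonomy"], lookup)
print(controlled, free)
```

Keeping the unmatched tags rather than discarding them fits the idea of offering free tags and controlled vocabulary side by side for navigation.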
- 4.3.3 Add/combine tagging into existing systems/services and integrate user contributions
- OPACs: (PennPal; OPACi prototype from Casey Bisson; Open WorldCat reviews)
- Subject Gateways: for resource selection and improved subject access
- Directories (Yahoo directory and social systems)
- Subject repositories: resource selection, new vocabulary, conceptual structures
- Citation services
- Digital libraries
- Search engines
- Blogs
- News services, RSS feeds
- KOS creation and development
- Museum online interactive exhibitions and object catalogues (Steve.museum; ED2 project Cambridge Univ. Museum of Anthropology)
- Metadata enhancement services (indexing richness, disambiguation)
- Retrieval views
- Views, layers: Tagging hits, controlled vocabulary hits, other
- Co-occurrence clustering
- Automatic linking via tags in social/participatory info systems to external resources
References
CiteULike http://www.citeulike.org
Connotea http://www.connotea.org
Delicious http://del.icio.us/
Digg http://digg.com/
Flickr http://www.flickr.com/
Furl http://furl.net/
RawSugar http://rawsugar.com
Unalog http://unalog.com/
Technorati http://www.technorati.com/
Bearman, D. and Trant, J. (2005). Social Terminology Enhancement through Vernacular Engagement. Exploring Collaborative Annotation to Encourage Interaction with Museum Collections. In: D-Lib Magazine, 11:9, Sept. 2005
http://www.dlib.org/dlib/september05/bearman/09bearman.html
Bray, Tim. Do tags work? 2005-03-04
http://www.tbray.org/ongoing/When/200x/2005/03/04/DoTagsWork
fac.etio.us http://www.siderean.com/facetious/facetious.jsp
Folksonomy. http://en.wikipedia.org/wiki/Folksonomy
Hammond, Tony, Hannay, Timo, Lund, Ben and Scott, Joanna. Social Bookmarking Tools (I): A General Review, D-Lib Magazine, 11(4), 2005.
http://www.dlib.org/dlib/april05/hammond/04hammond.html
Hannay, Timo: Introduction. August 19, 2004.
http://tagsonomy.com/index.php/introduction-timo-hannay/
Kroski, Ellyssa. The Hive Mind: Folksonomies and User-Based Tagging
http://infotangle.blogsome.com/2005/12/07/the-hive-mind-folksonomies-and-user-based-tagging/
Lust Digital Depot. http://www.lust.nl/lust/digitaldepot/
Mathes, Adam. Folksonomies - Cooperative Classification and Communication Through Shared Metadata
http://www.adammathes.com/academic/computer-mediated-communication/folksonomies.html
The searchguys weblog, May 13, 2005: Tags, keywords, and inconsistency
http://blogs.sun.com/roller/page/searchguy/20050513#tags_keywords_and_inconsistency
Steve.museum http://www.steve.museum/
Stephens, Michael. flickr tags http://www.flickr.com/photos/michaelsphotos/tags/
Tag patterns, from 10 bookmarking sites.
http://www.tagpatterns.com/
ex.: http://www.tagpatterns.com/tags/safari_export/all
Tagging. http://en.wikipedia.org/wiki/Tagging
Quintarelli, Emanuele: Folksonomies: Power to the People. Presented at the ISKO Italy-UniMIB meeting : Milan : June 24, 2005
http://www.iskoi.org/doc/folksonomies.htm
Weinberger, David. Taxonomies and tags: from trees to piles of leaves
http://www.hyperorg.com/blogger/misc/taxonomies_and_tags.html
Traugott Koch
Created: 2006-02-02
Last modified: 2006-02-06
URL: http://homes.ukoln.ac.uk/~tk213/pres/tagging0602.html