Authors
Susan Dumais, Hao Chen
Publication date
2000/7/1
Book
Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval
Pages
256-263
Description
This paper explores the use of hierarchical structure for classifying a large, heterogeneous collection of web content. The hierarchical structure is initially used to train different second-level classifiers. In the hierarchical case, a model is learned to distinguish a second-level category from other categories within the same top level. In the flat non-hierarchical case, a model distinguishes a second-level category from all other second-level categories. Scoring rules can further take advantage of the hierarchy by considering only second-level categories that exceed a threshold at the top level.
We use support vector machine (SVM) classifiers, which have been shown to be efficient and effective for classification but have not previously been explored in the context of hierarchical classification. We found small advantages in accuracy for hierarchical models over flat models. For the hierarchical approach, we found the same accuracy …
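The distinction the description draws between hierarchical and flat training, and the top-level threshold used at scoring time, can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the taxonomy, the category names, the function names, and the score dictionaries are all hypothetical, and the second-level threshold (`sub_threshold`) is an assumption added here so the example returns a decision. The scores stand in for the outputs of trained classifiers such as SVMs.

```python
from typing import Dict, List, Tuple

# Hypothetical two-level taxonomy: top-level category -> second-level categories.
TAXONOMY: Dict[str, List[str]] = {
    "Computers": ["Hardware", "Software"],
    "Sports": ["Soccer", "Tennis"],
}


def negative_categories(target: str, hierarchical: bool) -> List[str]:
    """Return the second-level categories that supply negative examples for `target`.

    Hierarchical case: only siblings under the same top-level category.
    Flat case: every other second-level category in the taxonomy.
    """
    parent = next(top for top, subs in TAXONOMY.items() if target in subs)
    if hierarchical:
        pool = TAXONOMY[parent]
    else:
        pool = [sub for subs in TAXONOMY.values() for sub in subs]
    return [sub for sub in pool if sub != target]


def gated_predictions(
    top_scores: Dict[str, float],
    sub_scores: Dict[str, float],
    top_threshold: float,
    sub_threshold: float,  # assumption: not specified in the description
) -> List[Tuple[str, float]]:
    """Score second-level categories only under top-level categories that
    exceed `top_threshold`; return accepted categories, best score first."""
    accepted: List[Tuple[str, float]] = []
    for top, subs in TAXONOMY.items():
        if top_scores.get(top, 0.0) <= top_threshold:
            continue  # parent did not exceed the threshold; skip its children
        for sub in subs:
            score = sub_scores.get(sub, 0.0)
            if score > sub_threshold:
                accepted.append((sub, score))
    return sorted(accepted, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    print(negative_categories("Hardware", hierarchical=True))
    print(negative_categories("Hardware", hierarchical=False))
    print(gated_predictions(
        {"Computers": 0.9, "Sports": 0.1},
        {"Hardware": 0.8, "Software": 0.3, "Soccer": 0.95, "Tennis": 0.2},
        top_threshold=0.5,
        sub_threshold=0.5,
    ))
```

In this toy input, "Soccer" has the highest second-level score but is never considered, because its parent "Sports" does not exceed the top-level threshold. That is the gating effect the description attributes to the hierarchical scoring rule.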
Total citations
[Chart: citations per year, 2001 to 2024]
Scholar articles
S Dumais, H Chen - Proceedings of the 23rd annual international ACM …, 2000