ALL RIGHTS RESERVED
SUPPORTING MULTIPLE INFORMATION-SEEKING STRATEGIES
IN A SINGLE SYSTEM FRAMEWORK
A Dissertation submitted to the
Graduate School-New Brunswick
Rutgers, The State University of New Jersey
in partial fulfillment of the requirements
for the degree of
Doctor of Philosophy
Graduate Program in Communication, Information, and Library Studies
written under the direction of
Nicholas J. Belkin
and approved by
________________________
New Brunswick, New Jersey
October, 2007
ABSTRACT OF THE DISSERTATION
SUPPORTING MULTIPLE INFORMATION-SEEKING STRATEGIES
IN A SINGLE SYSTEM FRAMEWORK
By XIAOJUN YUAN
Dissertation Director:
Nicholas J. Belkin

This study explores issues in information retrieval (IR) systems with special attention to information-seeking strategies (ISSs), the relation of ISSs to IR system design, and how to support multiple ISSs within a single system framework. It addresses the observation that people engage in a variety of ISSs within a single information-seeking episode. This study proposes to construct and evaluate an interactive IR (IIR) system which incorporates different IR support techniques to adaptively support multiple ISSs. Based on an information-seeking episode model (Belkin, 1996) and a multi-faceted classification scheme of information behaviors (Cool & Belkin, 2002), the study was conducted in three consecutive steps. Firstly, four experimental systems were designed and implemented, each tailored to one of the following IR support techniques: database summary, clustered retrieval results, table of contents navigation, and fielded query. A within-subjects experiment was conducted to compare each experimental system to its respective generic baseline system, which was constructed by following the current standard model with a specific query input and a ranked list of search results.
Results indicated that the experimental systems were superior to the baseline systems.
Secondly, information-seeking dialogue structures developed in the MERIT system (Belkin, Cool, Stein & Thiel, 1995) were adopted to guide the design of the IIR system.
The dialogue structures were built based on the Conversational Roles (COR) model (Sitter & Stein, 1992). Finally, an experimental system which supported multiple ISSs was built by incorporating the four IR support techniques and the dialogue structures.
This experimental system was tested in a within-subjects experiment in comparison to a generic baseline system. The experiment, with 32 subjects each searching on eight different topics, indicated that using the experimental system resulted in significantly better performance, significantly more effective interaction, and significantly better usability than the baseline system. These results demonstrated that it is possible to support quite different information-seeking behaviors within a single system framework which searchers can understand and use effectively. A principled approach to designing such systems needs to be further investigated.

ACKNOWLEDGEMENTS
My long journey to complete my doctoral dissertation would not have been possible without the support of many wonderful people, including my advisor, my committee members, my colleagues, my friends, and my family.
I would like to express my deep gratitude to my dissertation advisor, Dr. Nicholas J. Belkin, for his unparalleled insight, guidance, patience, and uncompromising emphasis on quality and meaningful research.
I would also like to thank my committee members, Dr. Susan Dumais, Dr. Michael Lesk, Dr. Anselm Spoerri, and Dr. Chengxiang Zhai. They are all outstanding researchers and mentors. What I have learned from them will benefit me for my lifetime.
Many thanks go to researchers in SCILS, Dr. Tefko Saracevic, Dr. Nina Wacholder, and Dr. Xiangmin Zhang, with whom I have worked on various research projects. I would also like to thank Dr. Daniel O’Connor, Dr. Claire McInerney, and Dr. Marie Radford, who gave me invaluable suggestions on my career path.
There are researchers outside of SCILS whom I would like to thank for their insightful comments on my dissertation at various doctoral consortiums (ASIST’04, SIGIR’04, HLT-NAACL’06): Marcia Bates, Alan Black, Ciprian Chelba, Bruce Croft, David Harper, Caroline Haythornthwaite, Ed Hovy, Barbara Kwasnik, Liz Liddy, Yoëlle Maarek, Alistair Moffat, Javed Mostafa, Doug Oard, Keith van Rijsbergen, Steve Robertson, Linda Schamber, Henry Small, Paul Solomon, and John Tait. I would also like to thank the anonymous reviewers from the SIGIR’07 conference.
research: NSF Grant #99-11942, and the Eugene Garfield Doctoral Dissertation Fellowship.
I would like to thank the researchers who assisted me with technical suggestions and support, including David Fisher from the University of Massachusetts at Amherst, Paul Ogilvie from Carnegie Mellon University, and Liang Zhou from the University of Southern California. I would also like to thank Jon Oliver for his technical support over the years.
I want to thank my friends at Rutgers. During the past several years, their friendship helped me overcome many difficulties, both in study and in life.
I would like to thank my parents, my husband, and my other family members. Their selfless support is something I can always count on. We have gone through many joyful and sad moments together during my years of study. Without them, I would not be where I stand right now.
ABSTRACT OF THE DISSERTATION
TABLE OF CONTENTS
LIST OF TABLES
LIST OF ILLUSTRATIONS
2. LITERATURE REVIEW
2.1 Information-Seeking Behavior Models
2.2 IIR Models
2.4 Models of Information-Seeking Dialogue
2.5 Integrated IR Systems
3. CONCEPTUAL FRAMEWORK
3.2 Multi-dimensional Classification of ISSs
3.3 Scanning vs. Searching
3.4 Predictions about the Optimum Combination of IR Support Techniques
3.5 LEMUR Toolkit
3.6 Extended Information Interaction Model
4. RESEARCH PROBLEM 1: RESEARCH METHOD
4.1 Overall Description
4.2 Implementing and Evaluating Different Systems for Supporting Specific ISSs
4.4 Situations and Tasks
4.4.1 Situation 1 (Scanning), Task 1 (T1.1, Identify best databases)
System Design (E1.1/B1.1)
4.4.2 Situation 1 (Scanning), Task 2 (T1.2, Find comments from an electronic book)
System Design (E1.2/B1.2)
4.4.3 Situation 2 (Searching), Task 1 (T2.1, Find relevant documents)
System Design (E2.1/B2.1)
4.4.4 Situation 2 (Searching), Task 2 (T2.2, Find the name of an electronic book)
System Design (E2.2/B2.2)
4.5 Tasks and Topics
4.6 Text Collections
4.6.1 Collection 1
4.6.2 Collection 2
4.7 Experimental Design
4.9 Measures and Variables
4.10 Data Collection
5. RESEARCH PROBLEM 1: RESULTS
5.1 Pilot Results of Experiment I
5.1.1 Systems and Questionnaires
5.1.2 Preliminary Findings
5.2 Results of Experiment I
5.2.4 Pre-search Questionnaire
5.2.5 Post-search Questionnaire
5.2.6 Post-system Questionnaire
5.2.7 Exit Questionnaire
6. RESEARCH PROBLEM 1: DISCUSSION
7. RESEARCH PROBLEM 1: CONCLUSIONS
8. RESEARCH PROBLEM 2: DIALOGUE STRUCTURE AND SYSTEM DESIGN
8.1 Specifying a Dialogue Structure for Information-Seeking
8.2 Standard Introduction Session
8.3 Example Dialogue Structures for Searching/Scanning
8.4 Implementing and Evaluating an Experimental System Supporting Multiple ISSs
8.4.1 General Design Issues
8.4.2 Experimental System Design
8.4.3 Experimental System Implementation
Welcome Screen
Learn about the Overall Structure of the System
Learn about Content Coverage of Databases on Various Topics
Search for Books on a Specific Topic
Search for News Articles on a Specific Topic
Other Features
8.4.4 Baseline System Design and Implementation
Welcome Screen
Search for Books on a Specific Topic
Search for News Articles on a Specific Topic
9. RESEARCH PROBLEM 3: RESEARCH METHOD
9.2 Integrated Situation
9.2.1 Scanning, then Searching
9.2.2 Searching, then Scanning
9.3 Tasks and Topics
9.4 Experimental Design
9.6 Measures and Variables
9.7 Data Collection
10. RESEARCH PROBLEM 3: RESULTS
10.1 Pilot Results of Experiment II
10.1.1 Systems and Tasks
10.1.2 Preliminary Findings
10.1.2.1 Usability of the Systems
10.1.2.2 Features Subjects Liked Most
10.1.2.3 Features Subjects Disliked Most
10.2 Results of Experiment II
10.2.4 System Order and Task Order Effect
10.2.5 Pre-search Questionnaire
10.2.6 Post-search Questionnaire
10.2.7 Post-system Questionnaire
10.2.8 Exit Interview
11. RESEARCH PROBLEM 3: DISCUSSION
APPENDIX A. A SAMPLE TOPIC FROM HARD 2004 CORPUS
APPENDIX B(1). CONSENT FORM (EXPERIMENT I)
APPENDIX B(2). ENTRY QUESTIONNAIRE (EXPERIMENT I)
APPENDIX B(3). PRE-SEARCH QUESTIONNAIRE (EXPERIMENT I)
APPENDIX B(4). POST-SEARCH QUESTIONNAIRE (EXPERIMENT I)
APPENDIX B(5). POST-SYSTEM QUESTIONNAIRE (EXPERIMENT I)
APPENDIX B(6). EXIT QUESTIONNAIRE (EXPERIMENT I)
APPENDIX C(1). CONSENT FORM (EXPERIMENT II)
APPENDIX C(2). ENTRY QUESTIONNAIRE (EXPERIMENT II)
APPENDIX C(3). PRE-SEARCH QUESTIONNAIRE (EXPERIMENT II)
APPENDIX C(4). POST-SEARCH QUESTIONNAIRE (EXPERIMENT II)
APPENDIX C(5). POST-SYSTEM QUESTIONNAIRE (EXPERIMENT II)
APPENDIX C(6). EXIT INTERVIEW (EXPERIMENT II)
LIST OF TABLES
Table 2.1 Information Search Process (ISP) (after Kuhlthau, 1991)
Table 2.2 Facets of a Classification of Interactions with Information (after Cool & Belkin, 2002)
Table 2.3 Categories of Research in ISSs
Table 3.1 Possible IR Support Techniques for Each IR Process
Table 3.2 Facets of ISSs (after Belkin et al.
Table 3.3 Multi-dimensional Classification of Scanning and Searching
Table 3.4 Examples of ISSs and the Corresponding Combination of IR Support Techniques
Table 4.1 Tasks, Problems, and Possible IR Support Techniques
Table 4.2 The Relations among Situations, Tasks and Systems
Table 4.3 Two Text Collections
Table 4.4 Structure of HARD 2004 Corpus (after Allan, 2005)
Table 5.1 Subject Characteristics (Experiment I)
Table 5.2 Computer and Searching Experience of Subjects (Experiment I)
Table 5.3 Performance of Systems (Experiment I)
Table 5.4 Significance Value of Systems (Experiment I)
Table 5.5 Result Correctness across Systems (Experiment I)
Table 5.6 Variables Used to Describe Search Behavior of Interaction (Experiment I)
Table 5.7 Mean and Standard Deviation of Interaction Variables (Experiment I)
Table 5.8 Topic Familiarity and Expertise (Experiment I)
Table 5.9 Post-Search Questionnaire Results (Experiment I)
Table 5.10 Post-System Questionnaire Results (Experiment I)
Table 5.11 System Comparison of the Exit Questionnaire (Experiment I)
Table 5.12 Comparison of the Post-system and the Exit Questionnaire (Experiment I)
Table 5.13 IR Support Techniques that Subjects Liked (Experiment I)
Table 5.14 IR Support Techniques that Subjects didn’t Like (Experiment I)
Table 6.1 Measures with Significant Results Favoring the Experimental System (Experiment I)
Table 8.1 Dialogue Structure of the Introduction Session
Table 8.2 Dialogue Structure I: for Searching
Table 8.3 Dialogue Structure II: for Scanning
Table 10.1 Subject Characteristics (Experiment II)
Table 10.2 Computer and Searching Experience of Subjects (Experiment II)
Table 10.3 Performance of Systems (Experiment II)
Table 10.4 Significance Value of Systems (Experiment II)
Table 10.5 Time and Result Satisfaction by Task Type (Experiment II)
Table 10.6 Result Correctness across Systems (Experiment II)
Table 10.7 Variables Used to Describe Search Behavior of Interaction (Experiment II)
Table 10.8 Mean and Standard Deviation of Interaction Variables (Experiment II)
Table 10.9 Topic Familiarity and Expertise (Experiment II)
Table 10.10 Post-search Questionnaire Results (Experiment II)
Table 10.11 Post-system Questionnaire Results (Experiment II)