DIAVIS-wiki


:private 

*A Study of Web Search Trends [#df679bbe]
-Amanda Spink, Bernard J. Jansen
-Webology, Volume 1, Number 2, December, 2004
-http://www.webology.ir/2004/v1n2/a4.html
-Covers user models and query-log studies of Web search.
Looks like a good source for tracking down related papers.



*Real Life, Real Users, and Real Needs: A Study and Analysis of User Queries on the Web [#x6112eec]

-SUMMARY

The analysis involved 51,473 queries from 18,113 users, with 113,776 terms altogether, of which 21,862 were
unique terms disregarding capitalization. We provide the highlights of our findings:
·  Most users did not have many queries per search. The mean number of queries per user was 2.8.
However, a sizable percentage of users did go on to either modify their original query or view subsequent
results.
·  Web queries are short. On the average, a query contained 2.21 terms. Queries in searching of regular IR
systems are some three to seven magnitudes larger. About one in three queries had one term only, two in
three had one or two terms, and four in five had one, two or three terms. Less than 4% of the queries were
more than 6 terms.
·  Relevance feedback was not used that much. About one in 20 queries used the feature More Like This. In
comparison with professionally assisted IR searching, relevance feedback is used half as much on the Web.
·  Boolean operators were not frequently used. One in 18 users used any Boolean capabilities, and of those
users that used them, every second user made a mistake, as defined by Excite rules. As to the queries,
about one in 12 queries contained a Boolean operator, and in those AND was used by far the most. About
one in 190 queries used nested logic. About one in every three queries that used Boolean operators or
parentheses was not entered as required by Excite. Web searchers are reluctant to use Boolean searches,
and when they do use them they have great difficulty in getting them right.
·  The ‘+’ and ‘-’ modifiers that specify a must for presence or absence of a term were used more than
Boolean operators. About 1 in 12 users used them. About one in 11 queries incorporated a ‘+’ or ‘-’
modifier. But a majority of uses were mistakes: about two out of three uses of these operators were
incorrect. The ability to create phrases (terms enclosed by quotation marks) was seldom used – about one
in 16 queries contained a phrase, but mistakes were negligible.
·  Most users searched one query only and did not follow with successive queries. The average session,
ignoring identical queries, was 1.6. About two in three users had a single query, and 6 in 7 did not go
beyond two queries.
·  On the average, users viewed 2.35 pages. Over half of users did not access results beyond the first page.
More than three in four users did not go beyond viewing two pages.
·  The distribution of the frequency of use of terms in queries was highly skewed. A few terms were used
repeatedly and a lot of terms were used only once. On the top of the list, the 63 subject terms that had a
frequency of appearance of 100 or more, represented only one third of one percent of all terms, but they
accounted for about one of every 10 terms used in all queries. Terms that appeared only once amounted to
a half of unique terms.
·  There is a lot of searching about sex on the Web, but altogether it represents only a small proportion of all
searches. When the top frequency terms are classified as to subject the top category is Sexual. As to the
frequency of appearance, about one in every four terms in the list of 63 highest used terms can be classified
as sexual in nature. But while sexual terms are high as a category, they still represent a very small
proportion of all terms. A great many other subjects are searched, and the diversity of subjects searched is
very high.
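
Note: the headline means in the summary follow directly from its totals (51,473 / 18,113 ≈ 2.8 queries per user; 113,776 / 51,473 ≈ 2.21 terms per query). Below is a minimal sketch of how such figures could be computed from a query log. It is not the authors' method: the log format (pairs of user ID and query string), whitespace tokenization, and the function name `summarize_query_log` are all assumptions made for illustration.

```python
from collections import Counter


def summarize_query_log(log):
    """Compute basic query-log statistics.

    `log` is an iterable of (user_id, query_string) pairs. This format is a
    hypothetical one chosen for the sketch, not the Excite log format.
    Terms are split on whitespace and compared case-insensitively.
    """
    queries_per_user = Counter()  # user_id -> number of queries
    term_counts = Counter()       # lowercased term -> occurrences
    length_counts = Counter()     # terms per query -> number of queries
    n_queries = 0

    for user_id, query in log:
        terms = query.lower().split()
        if not terms:
            continue  # skip empty queries
        n_queries += 1
        queries_per_user[user_id] += 1
        term_counts.update(terms)
        length_counts[len(terms)] += 1

    if n_queries == 0:
        raise ValueError("log contains no non-empty queries")

    total_terms = sum(term_counts.values())
    used_once = sum(1 for count in term_counts.values() if count == 1)

    return {
        "queries": n_queries,
        "users": len(queries_per_user),
        "total_terms": total_terms,
        "unique_terms": len(term_counts),
        "mean_queries_per_user": n_queries / len(queries_per_user),
        "mean_terms_per_query": total_terms / n_queries,
        "share_one_term_queries": length_counts[1] / n_queries,
        "share_terms_used_once": used_once / len(term_counts),
    }


if __name__ == "__main__":
    # Tiny made-up log, for illustration only.
    sample_log = [
        ("u1", "Cheap Flights"),
        ("u1", "cheap flights paris"),
        ("u2", "weather"),
        ("u3", "Weather forecast"),
    ]
    for name, value in summarize_query_log(sample_log).items():
        print(f"{name}: {value}")
```

Counting terms with a `Counter` also gives the skewed frequency distribution for free: `term_counts.most_common()` lists the few heavily used terms first, and `share_terms_used_once` corresponds to the "terms that appeared only once" figure.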



*http://www.cs.ucl.ac.uk/staff/A.Blandford/docs/saabjdJoDpreprint.pdf [#nb95084c]
This explains Yang's paper (Yang 1997, information seeking as problem-solving) and related work, so it may be worth using. Some material could probably be drawn from it.


*References [#q186ddc5]
-「転回」, p. 142