DIAVIS-wiki
 
 
 :private 
 
 *A Study of Web Search Trends [#df679bbe]
 -Amanda Spink,Bernard J. Jansen 
 -Webology, Volume 1, Number 2, December, 2004
 -http://www.webology.ir/2004/v1n2/a4.html
 -Covers user models and query studies for Web search.
 Looks useful as a source of related papers.
 
 
 
 *Real Life, Real Users, and Real Needs:A Study and Analysis of User Queries on the Web [#x6112eec]
 -http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.26.3279
 -by Major Bernard J. Jansen, Amanda Spink, Tefko Saracevic
 -http://jimjansen.tripod.com/academic/pubs/ipm98/ipm98.pdf
 -Saved as Jansen_200.pdf
 
 -SUMMARY
 
 The analysis involved 51,473 queries from 18,113 users, having all together 113,776 terms, of which 21,862 were
 unique terms disregarding capitalization. We provide the highlights of our findings:
 ·  The most users did not have many queries per search. The mean number of queries per user was 2.8.
 However, a sizable percentage of users did go on to either modify their original query or view subsequent
 results.
 ·  Web queries are short. On the average, a query contained 2.21 terms. Queries in searching of regular IR
 systems are some three to seven magnitudes larger. About one in three queries had one term only, two in
 three had one or two terms, and four in five had one, two or three terms. Less than 4% of the queries were
 more than 6 terms.
 ·  Relevance feedback was not used that much. About one in 20 queries used the feature More Like This. In
 comparison with professionally assisted IR searching, relevance feedback is used half as much on the Web.
 ·  Boolean operators were not frequently used. One in 18 users used any Boolean capabilities, and of those
 users that used them, every second user made a mistake, as defined by Excite rules. As to the queries,
 about one in 12 queries contained a Boolean operator, and in those AND was used by far the most. About
 one in 190 queries used nested logic. About one in every three queries that used Boolean operators or a
 parentheses was not entered as required by Excite. Web searchers are reluctant to use Boolean searches
 and when using they have great difficulty in getting them right.
 ·  The ‘+’ and ‘-’ modifiers that specify a must for presence or absence of a term were used more than
 Boolean operators. About 1 in 12 users used them. About one in 11 queries incorporated a ‘+’ or ‘-’
 modifier. But a majority of uses were mistakes: about two out of three uses of these operators were
 incorrect. The ability to create phrases (terms enclosed by quotation marks) was seldom used – about one
 in 16 queries contained a phrase, but mistakes were negligible.
 ·  Most users searched one query only and did not follow with successive queries. The average session,
 ignoring identical queries, was 1.6. About two in three users had a single query, and 6 in 7 did not go
 beyond two queries.
 ·  On the average, users viewed 2.35 pages. Over half of users did not access result beyond the first page.
 More than three in four users did not go beyond viewing two pages.
 ·  The distribution of the frequency of use of terms in queries was highly skewed. A few terms were used
 repeatedly and a lot of terms were used only once. On the top of the list, the 63 subject terms that had a
 frequency of appearance of 100 or more, represented only one third of one percent of all terms, but they
 accounted for about one of every 10 terms used in all queries. Terms that appeared only once amounted to
 a half of unique terms.
 ·  There is a lot of searching about sex on the Web, but all together it represents only a small proportion of all
 searches. When the top frequency terms are classified as to subject the top category is Sexual. As to the
 frequency of appearance, about one in every four terms in the list of 63 highest used terms can be classified
 as sexual in nature. But while sexual terms are high as a category, they still represent a very small
 proportion of all terms. A great many other subjects are searched, and the diversity of subjects searched is
 very high.
 **Summary of the analysis results [#m24abb8b]
 
 -Queries: 51,473
 -Users: 18,113
 -Search terms: 113,776 (21,862 unique, ignoring capitalization)
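
The per-user and per-query means quoted in the summary can be checked directly from these three totals. The following is only a quick arithmetic sketch; the variable names are made up for illustration and do not come from the paper.

```python
# Totals as reported in the paper's summary.
TOTAL_QUERIES = 51_473
TOTAL_USERS = 18_113
TOTAL_TERMS = 113_776

# Mean number of queries per user (the paper reports 2.8).
queries_per_user = TOTAL_QUERIES / TOTAL_USERS

# Mean number of terms per query (the paper reports 2.21).
terms_per_query = TOTAL_TERMS / TOTAL_QUERIES

print(f"queries per user: {queries_per_user:.2f}")  # 2.84
print(f"terms per query:  {terms_per_query:.2f}")   # 2.21
```

Both values agree with the reported figures once rounded.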
 
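 ***On queries [#efa98d4f]
 Jansen et al. report the results of examining 51,473 queries from 18,113 users of the Excite search engine [8]. According to their results, the mean number of queries per user was low, at 2.8, but a sizable percentage of users went on to modify their original query or to view subsequent results. Web queries are also short, averaging 2.21 terms, which is considerably shorter than queries in regular IR systems. 62% of all queries contained only one or two terms, and fewer than 4% of queries contained more than 6 terms.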
 
 
 Relevance feedback was not used much.
 About one in 20 queries used the "More Like This" feature.
 In comparison with professionally assisted IR searching, relevance feedback is used half as much on the Web.
 
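 Boolean operators were not used frequently.
 One in 18 users used any Boolean capabilities, and of those who did, every second user made a mistake as defined by Excite rules.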
 
 
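 On the classification of queries (Table 1).
 :Unique|The first query entered by a user.
 :Modified|A subsequent query by the same user, with terms added to or removed from the first query.
 :Identical queries|A query identical to an earlier query by the same user.
 :|There are two ways this can arise. The first possibility is that the user retyped the query. The second possibility is that the query was generated by Excite.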
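
The Table 1 categories can be sketched as a small labeling routine over one user's queries in the order they were entered. This is a simplified reading of the definitions in these notes, not the paper's actual procedure. The function names, the case-insensitive whitespace normalization, and the choice to label every non-first, non-identical query as "modified" are all assumptions made for illustration.

```python
from __future__ import annotations


def normalize(query: str) -> tuple[str, ...]:
    """Lowercase and split on whitespace (assumed normalization)."""
    return tuple(query.lower().split())


def classify_session(queries: list[str]) -> list[str]:
    """Label each query of one user as unique, modified, or identical."""
    labels: list[str] = []
    seen: set[tuple[str, ...]] = set()
    for position, raw in enumerate(queries):
        terms = normalize(raw)
        if position == 0:
            # The first query a user enters.
            labels.append("unique")
        elif terms in seen:
            # Same as an earlier query by this user.
            labels.append("identical")
        else:
            # Assumption: any other later query counts as modified.
            labels.append("modified")
        seen.add(terms)
    return labels


# Hypothetical session, invented for illustration.
session = ["jazz festival", "jazz festival 1997", "Jazz Festival", "jazz"]
print(classify_session(session))
# ['unique', 'modified', 'identical', 'modified']
```

A query-log record alone cannot tell whether an identical query was retyped by the user or generated by Excite, so the sketch does not try to separate the two cases.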
 
 Boolean operators were not frequently used. One in 18 users used any Boolean capabilities, and of those users that used them, every second user made a mistake, as defined by Excite rules. As to the queries, about one in 12 queries contained a Boolean operator, and in those AND was used by far the most. 
 
 
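 ***On viewing results [#k5a68f53]
 Across the 1996 to 1999 study period, in more than 70% of cases users viewed only the top 10 results. On average, users viewed 2.35 pages (10 results per page), and more than 50% of users did not go beyond the first page.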
 
 
 
 
 *http://www.cs.ucl.ac.uk/staff/A.Blandford/docs/saabjdJoDpreprint.pdf [#nb95084c]
 This paper discusses Yang's paper (Yang 1997, information seeking as problem-solving) and related work, so it may be worth using. Some material could probably be drawn from it.
 
 
 *References [#q186ddc5]
 -「転回」p.142
 
 