1. search engines and information retrieval
2. architecture of a search engine
3. crawls and feeds
4. processing text
5. ranking with indexes
6. queries and interfaces

After looking at the architecture of a search engine: how crawls and feeds fetch text, how the fetched text is processed, and how the order for showing the processed text is decided.

1. search engines and information retrieval
2. architecture of a search engine
   1. what is an architecture?
   2. basic building blocks
   3. breaking it down
      1. text acquisition
      2. text transformation
      3. index creation
      4. user interaction
      5. ranking
      6. evaluation
   4. how does it really work?
3. crawls and feeds
   1. deciding what to search
   2. crawling the web
      1. retrieving web pages
      2. the web crawler
      3. freshness
      4. focused crawling
      5. deep web
      6. sitemaps
      7. distributed crawling
   3. crawling documents and email
   4. document feeds
   5. the conversion problem
      1. character encodings
   6. storing the documents
      1. using a database system
      2. random access
      3. compression and large files
      4. update
      5. bigtable
   7. detecting duplicates
   8. removing noise
4. processing text
   1. from words to terms
   2. text statistics
   3. document parsing
   4. document structure and markup
   5. information extraction
   6. internationalization
5. ranking with indexes
   1. overview
   2. abstract model of ranking
   3. inverted indexes
   4. compression
   5. auxiliary structures
   6. index construction
   7. query processing
6. queries and interfaces
   1. information needs and queries
   2. query transformation and refinement
      1. stopping and stemming revisited
      2. spell checking and suggestions
      3. query expansion
      4. relevance feedback
      5. context and personalization
   3. showing the results
      1. result pages and snippets
      2. advertising and search
      3. clustering the results
   4. cross-language search

* * *

Key terms

- Pearson correlation coefficient
- similarity
- mutual information
- soundex code
- noisy channel
- delta encoding, compression
- v-Byte encoding
- Zipf, corpus
- crawler, freshness
- utf-8
- pull-based protocol, http request, server
- precision query, recall 1
- uri, url
- dynamic document

HMM: extract the product name and the company name from the sentence "loen melon". The only possible outputs in each state are loen and melon, and the output probabilities are p(loen|product) = 0.1, p(loen|company) = 0.8.

Delta-encode an inverted list given in decimal, then apply v-byte encoding. Express the final result in binary.

5 web pages: page 1 links to pages 2 and 3, page 2 links to pages 1 and 3... Draw the link relationships above as a graph. Write the equations for computing the PageRank values of the 5 web pages when the random jump probability is 0.2.

Compute the cosine similarity of the two documents below after **indexing** them ("and" is a stopword; no stemming). D1 = five thousand five hundred and fifty five dollars, D2 = fifty six thousand five hundred sixty five dollars

Assuming the given document collection follows Zipf's law with constant c=0.1, if the 3 most frequent words are all removed, **by what % does the total number of term occurrences in the collection decrease**?
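A rough way to check the Zipf question above, assuming the usual form r × P_r = c, so that the word at rank r accounts for a fraction c/r of all word occurrences. The function names `zipf_share` and `top_k_share` are made up for this sketch.

```python
from fractions import Fraction


def zipf_share(rank: int, c: Fraction) -> Fraction:
    """Fraction of all word occurrences taken by the word at `rank`,
    assuming Zipf's law in the form rank * P(rank) = c."""
    return c / rank


def top_k_share(k: int, c: Fraction) -> Fraction:
    """Combined share of the k most frequent words (ranks 1..k)."""
    return sum((zipf_share(r, c) for r in range(1, k + 1)), Fraction(0))


c = Fraction(1, 10)            # c = 0.1, as given in the question
removed = top_k_share(3, c)    # share lost when ranks 1, 2, 3 are deleted
print(removed, float(removed))
```

Under that assumption the three top ranks account for 0.1 × (1 + 1/2 + 1/3) = 11/60 of all occurrences, so removing them cuts the total by roughly 18.3%.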
Given a document collection of 1 million documents in total, answer the following questions using the information below.

- The number of documents in the collection that contain word A is 400,000.
- Of 100,000 documents that contain A, 30,000 contain word B.
- Of 200,000 documents randomly selected from the collection, 50,000 contain word C.
- Of those 50,000 documents that contain C, 10,000 contain B.

Find the number of documents that contain both A and B, both A and C, and all of A, B, and C.

- co-occurrence of two words
- idf value of a word
- browser
- http, statelessness, scalability
- dns, web server, browser, ip
- In the Squid web proxy, a GET request cannot be sent before the expiration time.
- A JavaScript program is a dynamic document.
- A web proxy acts as a server toward the browser.
- Relevant documents in an information retrieval system; 100% recall and precision.
- unicode symbol number encoding, 3 bytes
- Documents with the same checksum are exact duplicates.
- The Damerau-Levenshtein distance between gentlemand and gentelman is 1.
- The purpose of applying the Dice coefficient to stem classes is to **increase the number of words that belong to a single stem class**.

With 1 million documents in total, estimate the number of documents that contain all of a, b, c, d. Given: the probability that a, b, c occur together in a document + the probability that d is in a document when a and b are in it, 0.5.

When the document collection follows Zipf's law with c=0.1, what is the minimum number of words that account for at least 14% of all word occurrences?

Page a links to pages b and c, b links to c, c links to b -> draw the graph. With random jump probability 0.1: what are the PageRank values of web pages a, b, c?

1. 5 web pages. Suppose the crawler starts crawling from page a, never re-crawls a page that has already been crawled, and follows the links on a page in the order listed above. In what order are the pages crawled under breadth-first search and under depth-first search?
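The crawl-order question can be explored with a small sketch. The link graph `LINKS` below is hypothetical (the five-page graph from the exam is not reproduced in these notes), and `bfs_order` / `dfs_order` are illustrative names. As in the question, links on a page are followed in listed order and an already-crawled page is never crawled again.

```python
from collections import deque

# Hypothetical link graph, NOT the one from the exam question.
# Each page maps to its out-links in the order they appear on the page.
LINKS = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d", "e"],
    "d": ["a"],
    "e": [],
}


def bfs_order(links, start):
    """Breadth-first crawl: a FIFO frontier, pages marked when first discovered."""
    seen = {start}
    frontier = deque([start])
    order = []
    while frontier:
        page = frontier.popleft()
        order.append(page)
        for target in links[page]:
            if target not in seen:
                seen.add(target)
                frontier.append(target)
    return order


def dfs_order(links, start):
    """Depth-first crawl: follow the first uncrawled link as deep as possible."""
    seen = set()
    order = []

    def visit(page):
        seen.add(page)
        order.append(page)
        for target in links[page]:
            if target not in seen:
                visit(target)

    visit(start)
    return order


print(bfs_order(LINKS, "a"))  # ['a', 'b', 'c', 'd', 'e']
print(dfs_order(LINKS, "a"))  # ['a', 'b', 'd', 'c', 'e']
```

On this made-up graph the two strategies differ only in when d is reached: breadth-first finishes everything one link away from a before going deeper, while depth-first follows a -> b -> d before returning to c.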
For the given web graph, which is more effective: breadth-first search or depth-first search?

Assume a document collection D1, D2, D3: the order in which words appear.

Hidden Markov model: what do a Markov chain and an HMM have in common, and how do they differ?

Design an HMM with 2 states that takes a document as input and extracts keywords.

2012

- When specifying a DNS server, it is specified using the server's domain name.
- When estimating the size of the result set that contains all query terms, the reason for estimating from the rarest query word is to increase the size of the sample to be examined.
- Stemming contributes to increasing the number of search results.
- Stemming contributes to reducing the size of the inverted index.
- idf must be computed separately for each document.
- Documents in formats other than HTML cannot be delivered over HTTP.
- An HTTP message records where the browser that sent the message is located.
- The main purpose of a proxy is to reduce the amount of work the browser has to do.
- At search time, a document's PageRank score is independent of the given query.
- When extent lists are used, there must be one per term.
- The more the stemming rules for documents differ from the stemming rules for queries, the better.
- In general, for a retrieval model, as **recall** increases, **precision** also increases.
- When crawling with a fixed, limited number of computers, freshness inevitably drops as coverage grows.
- Documents can differ in content even when their simhash values are the same.
- The **content block finding optimization problem** can be solved with linear programming.

We want to estimate the size of the result set for the query terms a b c. There are N documents in total that contain a; analyzing S of them shows that m documents contain all of a, b, c. The size of the result set can be estimated as m/(S/N).
Prove, using the probabilities related to a, b, c, what probabilistic basis this kind of estimate has.

Delta-encode the following decimal numbers, then apply v-byte encoding. Express the final result in binary.

Using the HMM below to extract organization names from the sentence "나는 서울대에 합격했다" ("I was admitted to Seoul National University"), is 서울대 (Seoul National University) recognized as an organization name? i is the start state. Assuming the stemming result is "나 서울대 합격하다" (I / Seoul National University / be admitted), state whether it is recognized and give the reasoning.

5.1 What is the purpose of defining skip pointers on an inverted list?

5.2 Given an inverted list with 100 postings, what is the problem with defining a skip pointer on every posting, for 100 skip pointers in total?

5.3 When defining 10 skip pointers on an inverted list with 100 postings, on which postings is it best to define them?

1. Given two search engines A and B, estimate how many times larger B's index is than A's. Assuming an experiment produced the following results, how many times the size of A's index is B's index? 25% of the documents indexed by A are also indexed by B + 40% of the documents indexed by B are also indexed by A.

1. Web pages as follows: A links to C and D, ... With random jump probability 0.5 and PageRank values of 0.2 and 0.1 for B and E, what is the PageRank value of A?

2011

- In a trackback link used in blogs, the link is directed from an old document to a new document, unlike hyperlinks.
- HTTP: a stateful protocol that records the state about a client at the server side?
- A cookie usually stores entire session information about a user for up to one year.
- Proxies are necessarily located closer to clients than to servers.
- JSP is a programming language by which one can create a