id sid tid token lemma pos ital-3222 1 1 the the DET ital-3222 1 2 efficient efficient ADJ ital-3222 1 3 storage storage NOUN ital-3222 1 4 of of ADP ital-3222 1 5 text text NOUN ital-3222 1 6 documents document NOUN ital-3222 1 7 in in ADP ital-3222 1 8 digital digital ADJ ital-3222 1 9 libraries library NOUN ital-3222 1 10 | | NOUN ital-3222 1 11 skibiński skibiński NOUN ital-3222 1 12 and and CCONJ ital-3222 1 13 swacha swacha VERB ital-3222 1 14 143 143 NUM ital-3222 1 15 przemysław przemysław PROPN ital-3222 1 16 skibiński skibiński NOUN ital-3222 1 17 and and CCONJ ital-3222 1 18 jakub jakub ADJ ital-3222 1 19 swacha swacha PROPN ital-3222 1 20 the the DET ital-3222 1 21 efficient efficient ADJ ital-3222 1 22 storage storage NOUN ital-3222 1 23 of of ADP ital-3222 1 24 text text NOUN ital-3222 1 25 documents document NOUN ital-3222 1 26 in in ADP ital-3222 1 27 digital digital ADJ ital-3222 1 28 libraries library NOUN ital-3222 1 29 przemysław przemysław ADJ ital-3222 1 30 skibiński skibiński NOUN ital-3222 1 31 ( ( PUNCT ital-3222 1 32 inikep@ii.uni.wroc.pl inikep@ii.uni.wroc.pl X ital-3222 1 33 ) ) PUNCT ital-3222 1 34 is be AUX ital-3222 1 35 [ [ X ital-3222 1 36 qy qy ADP ital-3222 1 37 : : PUNCT ital-3222 1 38 title title NOUN ital-3222 1 39 ? ? PUNCT ital-3222 1 40 ] ] PUNCT ital-3222 1 41 , , PUNCT ital-3222 1 42 institute institute PROPN ital-3222 1 43 of of ADP ital-3222 1 44 computer computer NOUN ital-3222 1 45 science science NOUN ital-3222 1 46 , , PUNCT ital-3222 1 47 university university NOUN ital-3222 1 48 of of ADP ital-3222 1 49 wrocław wrocław PROPN ital-3222 1 50 , , PUNCT ital-3222 1 51 poland poland PROPN ital-3222 1 52 . . 
PUNCT ital-3222 2 1 jakub jakub PROPN ital-3222 2 2 swacha swacha PROPN ital-3222 2 3 ( ( PUNCT ital-3222 2 4 jakubs@uoo.univ.szczecin.pl jakubs@uoo.univ.szczecin.pl PROPN ital-3222 2 5 ) ) PUNCT ital-3222 2 6 is be AUX ital-3222 2 7 [ [ X ital-3222 2 8 qy qy ADP ital-3222 2 9 : : PUNCT ital-3222 2 10 title title NOUN ital-3222 2 11 ? ? PUNCT ital-3222 2 12 ] ] PUNCT ital-3222 2 13 , , PUNCT ital-3222 2 14 institute institute NOUN ital-3222 2 15 of of ADP ital-3222 2 16 information information NOUN ital-3222 2 17 technology technology NOUN ital-3222 2 18 in in ADP ital-3222 2 19 management management NOUN ital-3222 2 20 , , PUNCT ital-3222 2 21 university university PROPN ital-3222 2 22 of of ADP ital-3222 2 23 szczecin szczecin PROPN ital-3222 2 24 , , PUNCT ital-3222 2 25 poland poland PROPN ital-3222 2 26 . . PUNCT ital-3222 3 1 przemysław przemysław PROPN ital-3222 3 2 skibiński skibiński PROPN ital-3222 3 3 and and CCONJ ital-3222 3 4 jakub jakub ADJ ital-3222 3 5 swacha swacha PROPN ital-3222 3 6 the the DET ital-3222 3 7 efficient efficient ADJ ital-3222 3 8 storage storage NOUN ital-3222 3 9 of of ADP ital-3222 3 10 text text NOUN ital-3222 3 11 documents document NOUN ital-3222 3 12 in in ADP ital-3222 3 13 digital digital ADJ ital-3222 3 14 libraries library NOUN ital-3222 3 15 in in ADP ital-3222 3 16 this this DET ital-3222 3 17 paper paper NOUN ital-3222 3 18 we we PRON ital-3222 3 19 investigate investigate VERB ital-3222 3 20 the the DET ital-3222 3 21 possibility possibility NOUN ital-3222 3 22 of of ADP ital-3222 3 23 improving improve VERB ital-3222 3 24 the the DET ital-3222 3 25 efficiency efficiency NOUN ital-3222 3 26 of of ADP ital-3222 3 27 data data NOUN ital-3222 3 28 compression compression NOUN ital-3222 3 29 , , PUNCT ital-3222 3 30 and and CCONJ ital-3222 3 31 thus thus ADV ital-3222 3 32 reducing reduce VERB ital-3222 3 33 storage storage NOUN ital-3222 3 34 requirements requirement NOUN ital-3222 3 35 , , PUNCT ital-3222 3 36 for for 
ADP ital-3222 3 37 seven seven NUM ital-3222 3 38 widely widely ADV ital-3222 3 39 used use VERB ital-3222 3 40 text text NOUN ital-3222 3 41 document document NOUN ital-3222 3 42 formats format NOUN ital-3222 3 43 . . PUNCT ital-3222 4 1 we we PRON ital-3222 4 2 propose propose VERB ital-3222 4 3 an an DET ital-3222 4 4 open open ADJ ital-3222 4 5 - - PUNCT ital-3222 4 6 source source NOUN ital-3222 4 7 text text NOUN ital-3222 4 8 compression compression NOUN ital-3222 4 9 software software NOUN ital-3222 4 10 library library NOUN ital-3222 4 11 , , PUNCT ital-3222 4 12 featuring feature VERB ital-3222 4 13 an an DET ital-3222 4 14 advanced advanced ADJ ital-3222 4 15 word word NOUN ital-3222 4 16 - - PUNCT ital-3222 4 17 substitution substitution NOUN ital-3222 4 18 scheme scheme NOUN ital-3222 4 19 with with ADP ital-3222 4 20 static static ADJ ital-3222 4 21 and and CCONJ ital-3222 4 22 semidynamic semidynamic ADJ ital-3222 4 23 word word NOUN ital-3222 4 24 dictionaries dictionary NOUN ital-3222 4 25 . . 
PUNCT ital-3222 5 1 the the DET ital-3222 5 2 empirical empirical ADJ ital-3222 5 3 results result NOUN ital-3222 5 4 show show VERB ital-3222 5 5 an an DET ital-3222 5 6 average average ADJ ital-3222 5 7 storage storage NOUN ital-3222 5 8 space space NOUN ital-3222 5 9 reduction reduction NOUN ital-3222 5 10 as as ADV ital-3222 5 11 high high ADJ ital-3222 5 12 as as ADP ital-3222 5 13 78 78 NUM ital-3222 5 14 percent percent NOUN ital-3222 5 15 compared compare VERB ital-3222 5 16 to to ADP ital-3222 5 17 uncompressed uncompressed ADJ ital-3222 5 18 documents document NOUN ital-3222 5 19 , , PUNCT ital-3222 5 20 and and CCONJ ital-3222 5 21 as as ADV ital-3222 5 22 high high ADJ ital-3222 5 23 as as ADP ital-3222 5 24 30 30 NUM ital-3222 5 25 percent percent NOUN ital-3222 5 26 compared compare VERB ital-3222 5 27 to to ADP ital-3222 5 28 documents document NOUN ital-3222 5 29 compressed compress VERB ital-3222 5 30 with with ADP ital-3222 5 31 the the DET ital-3222 5 32 free free ADJ ital-3222 5 33 compression compression NOUN ital-3222 5 34 software software NOUN ital-3222 5 35 gzip gzip NOUN ital-3222 5 36 . . 
PUNCT ital-3222 6 1 i i PRON ital-3222 6 2 t t NOUN ital-3222 6 3 is be AUX ital-3222 6 4 hard hard ADJ ital-3222 6 5 to to PART ital-3222 6 6 expect expect VERB ital-3222 6 7 the the DET ital-3222 6 8 continuing continue VERB ital-3222 6 9 rapid rapid ADJ ital-3222 6 10 growth growth NOUN ital-3222 6 11 of of ADP ital-3222 6 12 global global ADJ ital-3222 6 13 information information NOUN ital-3222 6 14 volume volume NOUN ital-3222 6 15 not not PART ital-3222 6 16 to to PART ital-3222 6 17 affect affect VERB ital-3222 6 18 digital digital ADJ ital-3222 6 19 libraries.1 libraries.1 NOUN ital-3222 6 20 the the DET ital-3222 6 21 growth growth NOUN ital-3222 6 22 of of ADP ital-3222 6 23 stored store VERB ital-3222 6 24 information information NOUN ital-3222 6 25 volume volume NOUN ital-3222 6 26 means mean VERB ital-3222 6 27 growth growth NOUN ital-3222 6 28 in in ADP ital-3222 6 29 storage storage NOUN ital-3222 6 30 requirements requirement NOUN ital-3222 6 31 , , PUNCT ital-3222 6 32 which which PRON ital-3222 6 33 poses pose VERB ital-3222 6 34 a a DET ital-3222 6 35 problem problem NOUN ital-3222 6 36 in in ADP ital-3222 6 37 both both CCONJ ital-3222 6 38 technological technological ADJ ital-3222 6 39 and and CCONJ ital-3222 6 40 economic economic ADJ ital-3222 6 41 terms term NOUN ital-3222 6 42 . . 
PUNCT ital-3222 7 1 fortunately fortunately ADV ital-3222 7 2 , , PUNCT ital-3222 7 3 the the DET ital-3222 7 4 digital digital ADJ ital-3222 7 5 librarys librarys PROPN ital-3222 7 6 ’ ’ PART ital-3222 7 7 hunger hunger NOUN ital-3222 7 8 for for ADP ital-3222 7 9 resources resource NOUN ital-3222 7 10 can can AUX ital-3222 7 11 be be AUX ital-3222 7 12 tamed tame VERB ital-3222 7 13 with with ADP ital-3222 7 14 data datum NOUN ital-3222 7 15 compression.2 compression.2 ADP ital-3222 7 16 the the DET ital-3222 7 17 primary primary ADJ ital-3222 7 18 motivation motivation NOUN ital-3222 7 19 for for ADP ital-3222 7 20 our our PRON ital-3222 7 21 research research NOUN ital-3222 7 22 was be AUX ital-3222 7 23 to to PART ital-3222 7 24 limit limit VERB ital-3222 7 25 the the DET ital-3222 7 26 data data NOUN ital-3222 7 27 storage storage NOUN ital-3222 7 28 requirements requirement NOUN ital-3222 7 29 of of ADP ital-3222 7 30 the the DET ital-3222 7 31 student student NOUN ital-3222 7 32 thesis thesis NOUN ital-3222 7 33 electronic electronic ADJ ital-3222 7 34 archive archive NOUN ital-3222 7 35 in in ADP ital-3222 7 36 the the DET ital-3222 7 37 institute institute PROPN ital-3222 7 38 of of ADP ital-3222 7 39 information information NOUN ital-3222 7 40 technology technology NOUN ital-3222 7 41 in in ADP ital-3222 7 42 management management NOUN ital-3222 7 43 at at ADP ital-3222 7 44 the the DET ital-3222 7 45 university university PROPN ital-3222 7 46 of of ADP ital-3222 7 47 szczecin szczecin PROPN ital-3222 7 48 . . 
PUNCT ital-3222 8 1 the the DET ital-3222 8 2 current current ADJ ital-3222 8 3 regulations regulation NOUN ital-3222 8 4 state state NOUN ital-3222 8 5 that that SCONJ ital-3222 8 6 every every DET ital-3222 8 7 thesis thesis NOUN ital-3222 8 8 should should AUX ital-3222 8 9 be be AUX ital-3222 8 10 submitted submit VERB ital-3222 8 11 in in ADP ital-3222 8 12 both both PRON ital-3222 8 13 printed printed ADJ ital-3222 8 14 and and CCONJ ital-3222 8 15 electronic electronic ADJ ital-3222 8 16 form form NOUN ital-3222 8 17 . . PUNCT ital-3222 9 1 the the DET ital-3222 9 2 latter latter ADJ ital-3222 9 3 facilitates facilitate VERB ital-3222 9 4 automated automate VERB ital-3222 9 5 processing processing NOUN ital-3222 9 6 of of ADP ital-3222 9 7 the the DET ital-3222 9 8 documents document NOUN ital-3222 9 9 for for ADP ital-3222 9 10 purposes purpose NOUN ital-3222 9 11 such such ADJ ital-3222 9 12 as as ADP ital-3222 9 13 plagiarism plagiarism NOUN ital-3222 9 14 detection detection NOUN ital-3222 9 15 or or CCONJ ital-3222 9 16 statistical statistical ADJ ital-3222 9 17 language language NOUN ital-3222 9 18 analysis analysis NOUN ital-3222 9 19 . . 
PUNCT ital-3222 10 1 considering consider VERB ital-3222 10 2 the the DET ital-3222 10 3 introduction introduction NOUN ital-3222 10 4 of of ADP ital-3222 10 5 the the DET ital-3222 10 6 three three NUM ital-3222 10 7 - - PUNCT ital-3222 10 8 cycle cycle NOUN ital-3222 10 9 higher high ADJ ital-3222 10 10 education education NOUN ital-3222 10 11 system system NOUN ital-3222 10 12 ( ( PUNCT ital-3222 10 13 bachelor bachelor NOUN ital-3222 10 14 / / SYM ital-3222 10 15 master master NOUN ital-3222 10 16 / / SYM ital-3222 10 17 doctorate doctorate NOUN ital-3222 10 18 ) ) PUNCT ital-3222 10 19 , , PUNCT ital-3222 10 20 there there PRON ital-3222 10 21 are be VERB ital-3222 10 22 several several ADJ ital-3222 10 23 hundred hundred NUM ital-3222 10 24 theses thesis NOUN ital-3222 10 25 added add VERB ital-3222 10 26 to to ADP ital-3222 10 27 the the DET ital-3222 10 28 archive archive NOUN ital-3222 10 29 every every DET ital-3222 10 30 year year NOUN ital-3222 10 31 . . PUNCT ital-3222 11 1 although although SCONJ ital-3222 11 2 students student NOUN ital-3222 11 3 are be AUX ital-3222 11 4 asked ask VERB ital-3222 11 5 to to PART ital-3222 11 6 submit submit VERB ital-3222 11 7 microsoft microsoft PROPN ital-3222 11 8 word word NOUN ital-3222 11 9 – – PUNCT ital-3222 11 10 compatible compatible ADJ ital-3222 11 11 documents document NOUN ital-3222 11 12 such such ADJ ital-3222 11 13 as as ADP ital-3222 11 14 doc doc PROPN ital-3222 11 15 , , PUNCT ital-3222 11 16 docx docx NOUN ital-3222 11 17 , , PUNCT ital-3222 11 18 and and CCONJ ital-3222 11 19 rtf rtf ADJ ital-3222 11 20 , , PUNCT ital-3222 11 21 other other ADJ ital-3222 11 22 popular popular ADJ ital-3222 11 23 formats format NOUN ital-3222 11 24 such such ADJ ital-3222 11 25 as as ADP ital-3222 11 26 tex tex NOUN ital-3222 11 27 script script NOUN ital-3222 11 28 ( ( PUNCT ital-3222 11 29 tex tex PROPN ital-3222 11 30 ) ) PUNCT ital-3222 11 31 , , PUNCT ital-3222 11 32 html html PROPN ital-3222 11 33 , , PUNCT 
ital-3222 11 34 ps ps PROPN ital-3222 11 35 , , PUNCT ital-3222 11 36 and and CCONJ ital-3222 11 37 pdf pdf NOUN ital-3222 11 38 are be AUX ital-3222 11 39 also also ADV ital-3222 11 40 accepted accept VERB ital-3222 11 41 , , PUNCT ital-3222 11 42 both both PRON ital-3222 11 43 in in ADP ital-3222 11 44 the the DET ital-3222 11 45 case case NOUN ital-3222 11 46 of of ADP ital-3222 11 47 the the DET ital-3222 11 48 main main ADJ ital-3222 11 49 thesis thesis NOUN ital-3222 11 50 document document NOUN ital-3222 11 51 , , PUNCT ital-3222 11 52 containing contain VERB ital-3222 11 53 the the DET ital-3222 11 54 thesis thesis NOUN ital-3222 11 55 and and CCONJ ital-3222 11 56 any any DET ital-3222 11 57 appendixes appendix NOUN ital-3222 11 58 that that PRON ital-3222 11 59 were be AUX ital-3222 11 60 included include VERB ital-3222 11 61 in in ADP ital-3222 11 62 the the DET ital-3222 11 63 printed print VERB ital-3222 11 64 version version NOUN ital-3222 11 65 , , PUNCT ital-3222 11 66 and and CCONJ ital-3222 11 67 the the DET ital-3222 11 68 additional additional ADJ ital-3222 11 69 appendixes appendix NOUN ital-3222 11 70 , , PUNCT ital-3222 11 71 comprising comprise VERB ital-3222 11 72 materials material NOUN ital-3222 11 73 that that PRON ital-3222 11 74 were be AUX ital-3222 11 75 left leave VERB ital-3222 11 76 out out ADP ital-3222 11 77 of of ADP ital-3222 11 78 the the DET ital-3222 11 79 printed print VERB ital-3222 11 80 version version NOUN ital-3222 11 81 ( ( PUNCT ital-3222 11 82 such such ADJ ital-3222 11 83 as as ADP ital-3222 11 84 detailed detailed ADJ ital-3222 11 85 data datum NOUN ital-3222 11 86 tables table NOUN ital-3222 11 87 , , PUNCT ital-3222 11 88 the the DET ital-3222 11 89 full full ADJ ital-3222 11 90 source source NOUN ital-3222 11 91 code code NOUN ital-3222 11 92 of of ADP ital-3222 11 93 programs program NOUN ital-3222 11 94 , , PUNCT ital-3222 11 95 program program NOUN ital-3222 11 96 manuals manual NOUN ital-3222 11 97 , , 
PUNCT ital-3222 11 98 etc etc X ital-3222 11 99 . . X ital-3222 11 100 ) ) PUNCT ital-3222 11 101 . . PUNCT ital-3222 12 1 some some PRON ital-3222 12 2 of of ADP ital-3222 12 3 the the DET ital-3222 12 4 appendixes appendix NOUN ital-3222 12 5 may may AUX ital-3222 12 6 be be AUX ital-3222 12 7 multimedia multimedia NOUN ital-3222 12 8 , , PUNCT ital-3222 12 9 in in ADP ital-3222 12 10 formats format NOUN ital-3222 12 11 such such ADJ ital-3222 12 12 as as ADP ital-3222 12 13 png png NOUN ital-3222 12 14 , , PUNCT ital-3222 12 15 jpeg jpeg PROPN ital-3222 12 16 , , PUNCT ital-3222 12 17 or or CCONJ ital-3222 12 18 mpeg.3 mpeg.3 PROPN ital-3222 12 19 notice notice VERB ital-3222 12 20 that that SCONJ ital-3222 12 21 this this DET ital-3222 12 22 paper paper NOUN ital-3222 12 23 deals deal VERB ital-3222 12 24 with with ADP ital-3222 12 25 text text NOUN ital-3222 12 26 - - PUNCT ital-3222 12 27 document document NOUN ital-3222 12 28 compression compression NOUN ital-3222 12 29 only only ADV ital-3222 12 30 . . 
PUNCT ital-3222 13 1 although although SCONJ ital-3222 13 2 the the DET ital-3222 13 3 size size NOUN ital-3222 13 4 of of ADP ital-3222 13 5 individual individual ADJ ital-3222 13 6 text text NOUN ital-3222 13 7 documents document NOUN ital-3222 13 8 is be AUX ital-3222 13 9 often often ADV ital-3222 13 10 significantly significantly ADV ital-3222 13 11 smaller small ADJ ital-3222 13 12 than than ADP ital-3222 13 13 the the DET ital-3222 13 14 size size NOUN ital-3222 13 15 of of ADP ital-3222 13 16 individual individual ADJ ital-3222 13 17 multimedia multimedia NOUN ital-3222 13 18 objects object NOUN ital-3222 13 19 , , PUNCT ital-3222 13 20 their their PRON ital-3222 13 21 collective collective ADJ ital-3222 13 22 volume volume NOUN ital-3222 13 23 is be AUX ital-3222 13 24 large large ADJ ital-3222 13 25 enough enough ADV ital-3222 13 26 to to PART ital-3222 13 27 make make VERB ital-3222 13 28 the the DET ital-3222 13 29 compression compression NOUN ital-3222 13 30 effort effort NOUN ital-3222 13 31 worthwhile worthwhile ADJ ital-3222 13 32 . . 
PUNCT ital-3222 14 1 the the DET ital-3222 14 2 reason reason NOUN ital-3222 14 3 for for ADP ital-3222 14 4 focusing focus VERB ital-3222 14 5 on on ADP ital-3222 14 6 text text NOUN ital-3222 14 7 - - PUNCT ital-3222 14 8 document document NOUN ital-3222 14 9 compression compression NOUN ital-3222 14 10 is be AUX ital-3222 14 11 that that SCONJ ital-3222 14 12 most most ADJ ital-3222 14 13 multimedia multimedia NOUN ital-3222 14 14 formats format NOUN ital-3222 14 15 have have VERB ital-3222 14 16 efficient efficient ADJ ital-3222 14 17 compression compression NOUN ital-3222 14 18 schemes scheme NOUN ital-3222 14 19 embedded embed VERB ital-3222 14 20 , , PUNCT ital-3222 14 21 whereas whereas SCONJ ital-3222 14 22 text text NOUN ital-3222 14 23 document document NOUN ital-3222 14 24 formats format NOUN ital-3222 14 25 usually usually ADV ital-3222 14 26 either either CCONJ ital-3222 14 27 are be AUX ital-3222 14 28 uncompressed uncompressed ADJ ital-3222 14 29 or or CCONJ ital-3222 14 30 use use VERB ital-3222 14 31 schemes scheme NOUN ital-3222 14 32 with with ADP ital-3222 14 33 efficiency efficiency NOUN ital-3222 14 34 far far ADV ital-3222 14 35 worse bad ADJ ital-3222 14 36 than than ADP ital-3222 14 37 the the DET ital-3222 14 38 current current ADJ ital-3222 14 39 state state NOUN ital-3222 14 40 of of ADP ital-3222 14 41 the the DET ital-3222 14 42 art art NOUN ital-3222 14 43 in in ADP ital-3222 14 44 text text NOUN ital-3222 14 45 compression compression NOUN ital-3222 14 46 . . 
PUNCT ital-3222 15 1 although although SCONJ ital-3222 15 2 the the DET ital-3222 15 3 student student NOUN ital-3222 15 4 thesis thesis NOUN ital-3222 15 5 electronic electronic ADJ ital-3222 15 6 archive archive NOUN ital-3222 15 7 was be AUX ital-3222 15 8 our our PRON ital-3222 15 9 motivation motivation NOUN ital-3222 15 10 , , PUNCT ital-3222 15 11 we we PRON ital-3222 15 12 propose propose VERB ital-3222 15 13 a a DET ital-3222 15 14 solution solution NOUN ital-3222 15 15 that that PRON ital-3222 15 16 can can AUX ital-3222 15 17 be be AUX ital-3222 15 18 applied apply VERB ital-3222 15 19 to to ADP ital-3222 15 20 any any DET ital-3222 15 21 digital digital ADJ ital-3222 15 22 library library NOUN ital-3222 15 23 containing contain VERB ital-3222 15 24 text text NOUN ital-3222 15 25 documents document NOUN ital-3222 15 26 . . PUNCT ital-3222 16 1 as as SCONJ ital-3222 16 2 the the DET ital-3222 16 3 recent recent ADJ ital-3222 16 4 survey survey NOUN ital-3222 16 5 by by ADP ital-3222 16 6 kahl kahl NOUN ital-3222 16 7 and and CCONJ ital-3222 16 8 williams williams PROPN ital-3222 16 9 revealed reveal VERB ital-3222 16 10 , , PUNCT ital-3222 16 11 57.5 57.5 NUM ital-3222 16 12 percent percent NOUN ital-3222 16 13 of of ADP ital-3222 16 14 the the DET ital-3222 16 15 examined examine VERB ital-3222 16 16 1,117 1,117 NUM ital-3222 16 17 digital digital ADJ ital-3222 16 18 library library NOUN ital-3222 16 19 projects project NOUN ital-3222 16 20 consisted consist VERB ital-3222 16 21 of of ADP ital-3222 16 22 text text NOUN ital-3222 16 23 content content NOUN ital-3222 16 24 , , PUNCT ital-3222 16 25 so so ADV ital-3222 16 26 there there PRON ital-3222 16 27 are be VERB ital-3222 16 28 numerous numerous ADJ ital-3222 16 29 libraries library NOUN ital-3222 16 30 that that PRON ital-3222 16 31 could could AUX ital-3222 16 32 benefit benefit VERB ital-3222 16 33 form form NOUN ital-3222 16 34 implementation implementation NOUN ital-3222 16 35 of of ADP 
ital-3222 16 36 the the DET ital-3222 16 37 proposed propose VERB ital-3222 16 38 scheme.4 scheme.4 NOUN ital-3222 16 39 in in ADP ital-3222 16 40 this this DET ital-3222 16 41 paper paper NOUN ital-3222 16 42 , , PUNCT ital-3222 16 43 we we PRON ital-3222 16 44 describe describe VERB ital-3222 16 45 a a DET ital-3222 16 46 state state NOUN ital-3222 16 47 - - PUNCT ital-3222 16 48 of of ADP ital-3222 16 49 - - PUNCT ital-3222 16 50 the the DET ital-3222 16 51 - - PUNCT ital-3222 16 52 art art NOUN ital-3222 16 53 approach approach NOUN ital-3222 16 54 to to ADP ital-3222 16 55 text text NOUN ital-3222 16 56 - - PUNCT ital-3222 16 57 document document NOUN ital-3222 16 58 compression compression NOUN ital-3222 16 59 and and CCONJ ital-3222 16 60 present present VERB ital-3222 16 61 an an DET ital-3222 16 62 opensource opensource NOUN ital-3222 16 63 software software NOUN ital-3222 16 64 library library NOUN ital-3222 16 65 implementing implement VERB ital-3222 16 66 the the DET ital-3222 16 67 scheme scheme NOUN ital-3222 16 68 that that PRON ital-3222 16 69 can can AUX ital-3222 16 70 be be AUX ital-3222 16 71 freely freely ADV ital-3222 16 72 used use VERB ital-3222 16 73 in in ADP ital-3222 16 74 digital digital ADJ ital-3222 16 75 library library NOUN ital-3222 16 76 projects project NOUN ital-3222 16 77 . . 
PUNCT ital-3222 17 1 in in ADP ital-3222 17 2 the the DET ital-3222 17 3 case case NOUN ital-3222 17 4 of of ADP ital-3222 17 5 text text NOUN ital-3222 17 6 documents document NOUN ital-3222 17 7 , , PUNCT ital-3222 17 8 improvement improvement NOUN ital-3222 17 9 in in ADP ital-3222 17 10 compression compression NOUN ital-3222 17 11 effectiveness effectiveness NOUN ital-3222 17 12 may may AUX ital-3222 17 13 be be AUX ital-3222 17 14 obtained obtain VERB ital-3222 17 15 in in ADP ital-3222 17 16 two two NUM ital-3222 17 17 ways way NOUN ital-3222 17 18 : : PUNCT ital-3222 17 19 with with ADP ital-3222 17 20 or or CCONJ ital-3222 17 21 without without ADP ital-3222 17 22 regard regard NOUN ital-3222 17 23 to to ADP ital-3222 17 24 their their PRON ital-3222 17 25 format format NOUN ital-3222 17 26 . . PUNCT ital-3222 18 1 the the DET ital-3222 18 2 more more ADV ital-3222 18 3 nontextual nontextual ADJ ital-3222 18 4 content content NOUN ital-3222 18 5 in in ADP ital-3222 18 6 a a DET ital-3222 18 7 document document NOUN ital-3222 18 8 ( ( PUNCT ital-3222 18 9 e.g. e.g. ADV ital-3222 18 10 , , PUNCT ital-3222 18 11 formatting format VERB ital-3222 18 12 instructions instruction NOUN ital-3222 18 13 , , PUNCT ital-3222 18 14 structure structure NOUN ital-3222 18 15 description description NOUN ital-3222 18 16 , , PUNCT ital-3222 18 17 or or CCONJ ital-3222 18 18 embedded embed VERB ital-3222 18 19 images image NOUN ital-3222 18 20 ) ) PUNCT ital-3222 18 21 , , PUNCT ital-3222 18 22 the the PRON ital-3222 18 23 more more ADJ ital-3222 18 24 it it PRON ital-3222 18 25 requires require VERB ital-3222 18 26 format format NOUN ital-3222 18 27 - - PUNCT ital-3222 18 28 specific specific ADJ ital-3222 18 29 processing processing NOUN ital-3222 18 30 to to PART ital-3222 18 31 improve improve VERB ital-3222 18 32 its its PRON ital-3222 18 33 compression compression NOUN ital-3222 18 34 ratio ratio NOUN ital-3222 18 35 . . 
PUNCT ital-3222 19 1 this this PRON ital-3222 19 2 is be AUX ital-3222 19 3 because because SCONJ ital-3222 19 4 most most ADJ ital-3222 19 5 document document NOUN ital-3222 19 6 formats format NOUN ital-3222 19 7 have have VERB ital-3222 19 8 their their PRON ital-3222 19 9 own own ADJ ital-3222 19 10 ways way NOUN ital-3222 19 11 of of ADP ital-3222 19 12 describing describe VERB ital-3222 19 13 their their PRON ital-3222 19 14 formatting formatting ADJ ital-3222 19 15 , , PUNCT ital-3222 19 16 structure structure NOUN ital-3222 19 17 , , PUNCT ital-3222 19 18 and and CCONJ ital-3222 19 19 nontextual nontextual ADJ ital-3222 19 20 inclusions inclusion NOUN ital-3222 19 21 ( ( PUNCT ital-3222 19 22 plain plain ADJ ital-3222 19 23 text text NOUN ital-3222 19 24 files file NOUN ital-3222 19 25 have have VERB ital-3222 19 26 no no DET ital-3222 19 27 inclusions inclusion NOUN ital-3222 19 28 ) ) PUNCT ital-3222 19 29 . . PUNCT ital-3222 20 1 for for ADP ital-3222 20 2 this this DET ital-3222 20 3 reason reason NOUN ital-3222 20 4 , , PUNCT ital-3222 20 5 we we PRON ital-3222 20 6 have have AUX ital-3222 20 7 developed develop VERB ital-3222 20 8 a a DET ital-3222 20 9 compound compound NOUN ital-3222 20 10 scheme scheme NOUN ital-3222 20 11 that that PRON ital-3222 20 12 consists consist VERB ital-3222 20 13 of of ADP ital-3222 20 14 several several ADJ ital-3222 20 15 subschemes subscheme NOUN ital-3222 20 16 that that PRON ital-3222 20 17 can can AUX ital-3222 20 18 be be AUX ital-3222 20 19 turned turn VERB ital-3222 20 20 on on ADP ital-3222 20 21 and and CCONJ ital-3222 20 22 off off ADP ital-3222 20 23 or or CCONJ ital-3222 20 24 run run VERB ital-3222 20 25 with with ADP ital-3222 20 26 different different ADJ ital-3222 20 27 parameters parameter NOUN ital-3222 20 28 . . 
PUNCT ital-3222 21 1 the the DET ital-3222 21 2 most most ADV ital-3222 21 3 suitable suitable ADJ ital-3222 21 4 solution solution NOUN ital-3222 21 5 for for ADP ital-3222 21 6 a a DET ital-3222 21 7 given give VERB ital-3222 21 8 document document NOUN ital-3222 21 9 format format NOUN ital-3222 21 10 can can AUX ital-3222 21 11 be be AUX ital-3222 21 12 obtained obtain VERB ital-3222 21 13 by by ADP ital-3222 21 14 merely merely ADV ital-3222 21 15 choosing choose VERB ital-3222 21 16 the the DET ital-3222 21 17 right right ADJ ital-3222 21 18 schemes scheme NOUN ital-3222 21 19 and and CCONJ ital-3222 21 20 adequate adequate ADJ ital-3222 21 21 parameter parameter NOUN ital-3222 21 22 values value NOUN ital-3222 21 23 . . PUNCT ital-3222 22 1 experimentally experimentally ADV ital-3222 22 2 , , PUNCT ital-3222 22 3 we we PRON ital-3222 22 4 have have AUX ital-3222 22 5 found find VERB ital-3222 22 6 the the DET ital-3222 22 7 optimal optimal ADJ ital-3222 22 8 subscheme subscheme NOUN ital-3222 22 9 combinations combination NOUN ital-3222 22 10 for for ADP ital-3222 22 11 the the DET ital-3222 22 12 following follow VERB ital-3222 22 13 formats format NOUN ital-3222 22 14 used use VERB ital-3222 22 15 in in ADP ital-3222 22 16 digital digital ADJ ital-3222 22 17 libraries library NOUN ital-3222 22 18 : : PUNCT ital-3222 22 19 plain plain ADJ ital-3222 22 20 text text NOUN ital-3222 22 21 , , PUNCT ital-3222 22 22 tex tex PROPN ital-3222 22 23 , , PUNCT ital-3222 22 24 rtf rtf NOUN ital-3222 22 25 , , PUNCT ital-3222 22 26 text text NOUN ital-3222 22 27 annotated annotate VERB ital-3222 22 28 with with ADP ital-3222 22 29 xml xml PROPN ital-3222 22 30 , , PUNCT ital-3222 22 31 html html PROPN ital-3222 22 32 , , PUNCT ital-3222 22 33 as as ADV ital-3222 22 34 well well ADV ital-3222 22 35 as as ADP ital-3222 22 36 the the DET ital-3222 22 37 device device NOUN ital-3222 22 38 - - PUNCT ital-3222 22 39 independent independent ADJ ital-3222 22 40 rendering 
rendering NOUN ital-3222 22 41 formats format NOUN ital-3222 22 42 ps ps PROPN ital-3222 22 43 and and CCONJ ital-3222 22 44 pdf.5 pdf.5 PROPN ital-3222 22 45 first first ADV ital-3222 22 46 we we PRON ital-3222 22 47 discuss discuss VERB ital-3222 22 48 related related ADJ ital-3222 22 49 work work NOUN ital-3222 22 50 in in ADP ital-3222 22 51 text text NOUN ital-3222 22 52 compression compression NOUN ital-3222 22 53 , , PUNCT ital-3222 22 54 then then ADV ital-3222 22 55 describe describe VERB ital-3222 22 56 the the DET ital-3222 22 57 basis basis NOUN ital-3222 22 58 of of ADP ital-3222 22 59 the the DET ital-3222 22 60 proposed propose VERB ital-3222 22 61 scheme scheme NOUN ital-3222 22 62 and and CCONJ ital-3222 22 63 how how SCONJ ital-3222 22 64 it it PRON ital-3222 22 65 should should AUX ital-3222 22 66 be be AUX ital-3222 22 67 adapted adapt VERB ital-3222 22 68 for for ADP ital-3222 22 69 particular particular ADJ ital-3222 22 70 document document NOUN ital-3222 22 71 formats format NOUN ital-3222 22 72 . . PUNCT ital-3222 23 1 the the DET ital-3222 23 2 section section NOUN ital-3222 23 3 “ " PUNCT ital-3222 23 4 using use VERB ital-3222 23 5 the the DET ital-3222 23 6 scheme scheme NOUN ital-3222 23 7 in in ADP ital-3222 23 8 a a DET ital-3222 23 9 digital digital ADJ ital-3222 23 10 library library NOUN ital-3222 23 11 project project NOUN ital-3222 23 12 ” " PUNCT ital-3222 23 13 discusses discuss VERB ital-3222 23 14 how how SCONJ ital-3222 23 15 to to PART ital-3222 23 16 use use VERB ital-3222 23 17 the the DET ital-3222 23 18 free free ADJ ital-3222 23 19 software software NOUN ital-3222 23 20 library library NOUN ital-3222 23 21 that that PRON ital-3222 23 22 implements implement VERB ital-3222 23 23 the the DET ital-3222 23 24 scheme scheme NOUN ital-3222 23 25 . . 
PUNCT ital-3222 24 1 then then ADV ital-3222 24 2 we we PRON ital-3222 24 3 cover cover VERB ital-3222 24 4 the the DET ital-3222 24 5 results result NOUN ital-3222 24 6 of of ADP ital-3222 24 7 experiments experiment NOUN ital-3222 24 8 involving involve VERB ital-3222 24 9 the the DET ital-3222 24 10 proposed propose VERB ital-3222 24 11 scheme scheme NOUN ital-3222 24 12 and and CCONJ ital-3222 24 13 a a DET ital-3222 24 14 corpus corpus NOUN ital-3222 24 15 of of ADP ital-3222 24 16 test test NOUN ital-3222 24 17 files file NOUN ital-3222 24 18 in in ADP ital-3222 24 19 each each PRON ital-3222 24 20 of of ADP ital-3222 24 21 the the DET ital-3222 24 22 tested test VERB ital-3222 24 23 formats format NOUN ital-3222 24 24 . . PUNCT ital-3222 25 1 n n ADV ital-3222 25 2 text text NOUN ital-3222 25 3 compression compression NOUN ital-3222 25 4 there there PRON ital-3222 25 5 are be VERB ital-3222 25 6 two two NUM ital-3222 25 7 basic basic ADJ ital-3222 25 8 principles principle NOUN ital-3222 25 9 of of ADP ital-3222 25 10 general general ADJ ital-3222 25 11 - - PUNCT ital-3222 25 12 purpose purpose NOUN ital-3222 25 13 data datum NOUN ital-3222 25 14 compression compression NOUN ital-3222 25 15 . . 
PUNCT ital-3222 26 1 the the DET ital-3222 26 2 first first ADJ ital-3222 26 3 one one NUM ital-3222 26 4 works work VERB ital-3222 26 5 on on ADP ital-3222 26 6 the the DET ital-3222 26 7 level level NOUN ital-3222 26 8 of of ADP ital-3222 26 9 character character NOUN ital-3222 26 10 sequences sequence NOUN ital-3222 26 11 , , PUNCT ital-3222 26 12 the the DET ital-3222 26 13 second second ADJ ital-3222 26 14 one one NOUN ital-3222 26 15 works work VERB ital-3222 26 16 on on ADP ital-3222 26 17 the the DET ital-3222 26 18 level level NOUN ital-3222 26 19 of of ADP ital-3222 26 20 przemysław przemysław ADJ ital-3222 26 21 skibiński skibiński NOUN ital-3222 26 22 ( ( PUNCT ital-3222 26 23 inikep@ii.uni.wroc.pl inikep@ii.uni.wroc.pl X ital-3222 26 24 ) ) PUNCT ital-3222 26 25 is be AUX ital-3222 26 26 associate associate ADJ ital-3222 26 27 professor professor NOUN ital-3222 26 28 , , PUNCT ital-3222 26 29 institute institute PROPN ital-3222 26 30 of of ADP ital-3222 26 31 computer computer NOUN ital-3222 26 32 science science NOUN ital-3222 26 33 , , PUNCT ital-3222 26 34 university university NOUN ital-3222 26 35 of of ADP ital-3222 26 36 wrocław wrocław PROPN ital-3222 26 37 , , PUNCT ital-3222 26 38 poland poland PROPN ital-3222 26 39 . . 
PUNCT ital-3222 27 1 jakub jakub PROPN ital-3222 27 2 swacha swacha PROPN ital-3222 27 3 ( ( PUNCT ital-3222 27 4 jakubs@uoo.univ.szczecin jakubs@uoo.univ.szczecin NOUN ital-3222 27 5 .pl .pl X ital-3222 27 6 ) ) PUNCT ital-3222 27 7 is be AUX ital-3222 27 8 associate associate ADJ ital-3222 27 9 professor professor NOUN ital-3222 27 10 , , PUNCT ital-3222 27 11 institute institute PROPN ital-3222 27 12 of of ADP ital-3222 27 13 information information NOUN ital-3222 27 14 technology technology NOUN ital-3222 27 15 in in ADP ital-3222 27 16 management management NOUN ital-3222 27 17 , , PUNCT ital-3222 27 18 university university PROPN ital-3222 27 19 of of ADP ital-3222 27 20 szczecin szczecin PROPN ital-3222 27 21 , , PUNCT ital-3222 27 22 poland poland PROPN ital-3222 27 23 . . PUNCT ital-3222 28 1 144 144 NUM ital-3222 28 2 information information NOUN ital-3222 28 3 technology technology NOUN ital-3222 28 4 and and CCONJ ital-3222 28 5 libraries library NOUN ital-3222 28 6 | | NOUN ital-3222 28 7 september september PROPN ital-3222 28 8 2009 2009 NUM ital-3222 28 9 individual individual ADJ ital-3222 28 10 characters character NOUN ital-3222 28 11 . . 
PUNCT ital-3222 29 1 in in ADP ital-3222 29 2 the the DET ital-3222 29 3 first first ADJ ital-3222 29 4 case case NOUN ital-3222 29 5 , , PUNCT ital-3222 29 6 the the DET ital-3222 29 7 idea idea NOUN ital-3222 29 8 is be AUX ital-3222 29 9 to to PART ital-3222 29 10 look look VERB ital-3222 29 11 for for ADP ital-3222 29 12 matching match VERB ital-3222 29 13 character character NOUN ital-3222 29 14 sequences sequence NOUN ital-3222 29 15 in in ADP ital-3222 29 16 the the DET ital-3222 29 17 past past ADJ ital-3222 29 18 buffer buffer NOUN ital-3222 29 19 of of ADP ital-3222 29 20 the the DET ital-3222 29 21 file file NOUN ital-3222 29 22 being be AUX ital-3222 29 23 compressed compress VERB ital-3222 29 24 and and CCONJ ital-3222 29 25 replace replace VERB ital-3222 29 26 such such ADJ ital-3222 29 27 sequences sequence NOUN ital-3222 29 28 with with ADP ital-3222 29 29 shorter short ADJ ital-3222 29 30 code code NOUN ital-3222 29 31 words word NOUN ital-3222 29 32 ; ; PUNCT ital-3222 29 33 this this DET ital-3222 29 34 principle principle NOUN ital-3222 29 35 underlies underlie VERB ital-3222 29 36 the the DET ital-3222 29 37 algorithms algorithm NOUN ital-3222 29 38 derived derive VERB ital-3222 29 39 from from ADP ital-3222 29 40 the the DET ital-3222 29 41 concepts concept NOUN ital-3222 29 42 of of ADP ital-3222 29 43 abraham abraham PROPN ital-3222 29 44 lempel lempel PROPN ital-3222 29 45 and and CCONJ ital-3222 29 46 jacob jacob PROPN ital-3222 29 47 ziv ziv PROPN ital-3222 29 48 ( ( PUNCT ital-3222 29 49 lz lz PROPN ital-3222 29 50 - - PROPN ital-3222 29 51 type).6 type).6 PROPN ital-3222 29 52 in in ADP ital-3222 29 53 the the DET ital-3222 29 54 second second ADJ ital-3222 29 55 case case NOUN ital-3222 29 56 , , PUNCT ital-3222 29 57 the the DET ital-3222 29 58 idea idea NOUN ital-3222 29 59 is be AUX ital-3222 29 60 to to PART ital-3222 29 61 gather gather VERB ital-3222 29 62 frequency frequency NOUN ital-3222 29 63 statistics statistic NOUN 
ital-3222 29 64 for for ADP ital-3222 29 65 characters character NOUN ital-3222 29 66 in in ADP ital-3222 29 67 the the DET ital-3222 29 68 file file NOUN ital-3222 29 69 being be AUX ital-3222 29 70 compressed compress VERB ital-3222 29 71 and and CCONJ ital-3222 29 72 then then ADV ital-3222 29 73 assign assign VERB ital-3222 29 74 shorter short ADJ ital-3222 29 75 code code NOUN ital-3222 29 76 words word NOUN ital-3222 29 77 for for ADP ital-3222 29 78 frequent frequent ADJ ital-3222 29 79 characters character NOUN ital-3222 29 80 and and CCONJ ital-3222 29 81 longer long ADJ ital-3222 29 82 ones one NOUN ital-3222 29 83 for for ADP ital-3222 29 84 rare rare ADJ ital-3222 29 85 characters character NOUN ital-3222 29 86 ( ( PUNCT ital-3222 29 87 this this PRON ital-3222 29 88 is be AUX ital-3222 29 89 exactly exactly ADV ital-3222 29 90 how how SCONJ ital-3222 29 91 huffman huffman ADJ ital-3222 29 92 coding code VERB ital-3222 29 93 works work NOUN ital-3222 29 94 — — PUNCT ital-3222 29 95 what what PRON ital-3222 29 96 arithmetic arithmetic ADJ ital-3222 29 97 coding coding ADJ ital-3222 29 98 assigns assign NOUN ital-3222 29 99 are be AUX ital-3222 29 100 value value NOUN ital-3222 29 101 ranges range VERB ital-3222 29 102 rather rather ADV ital-3222 29 103 than than ADP ital-3222 29 104 individual individual ADJ ital-3222 29 105 code code NOUN ital-3222 29 106 words).7 words).7 PROPN ital-3222 29 107 as as SCONJ ital-3222 29 108 the the DET ital-3222 29 109 characters character NOUN ital-3222 29 110 form form VERB ital-3222 29 111 words word NOUN ital-3222 29 112 , , PUNCT ital-3222 29 113 and and CCONJ ital-3222 29 114 words word NOUN ital-3222 29 115 form form VERB ital-3222 29 116 phrases phrase NOUN ital-3222 29 117 , , PUNCT ital-3222 29 118 there there PRON ital-3222 29 119 is be VERB ital-3222 29 120 high high ADJ ital-3222 29 121 correlation correlation NOUN ital-3222 29 122 between between ADP ital-3222 29 123 subsequent subsequent ADJ ital-3222 29 
124 characters character NOUN ital-3222 29 125 . . PUNCT ital-3222 30 1 to to PART ital-3222 30 2 produce produce VERB ital-3222 30 3 shorter short ADJ ital-3222 30 4 code code NOUN ital-3222 30 5 words word NOUN ital-3222 30 6 , , PUNCT ital-3222 30 7 a a DET ital-3222 30 8 compression compression NOUN ital-3222 30 9 algorithm algorithm NOUN ital-3222 30 10 either either CCONJ ital-3222 30 11 has have VERB ital-3222 30 12 to to PART ital-3222 30 13 observe observe VERB ital-3222 30 14 the the DET ital-3222 30 15 context context NOUN ital-3222 30 16 ( ( PUNCT ital-3222 30 17 understood understand VERB ital-3222 30 18 as as ADP ital-3222 30 19 several several ADJ ital-3222 30 20 preceding precede VERB ital-3222 30 21 characters character NOUN ital-3222 30 22 ) ) PUNCT ital-3222 30 23 in in ADP ital-3222 30 24 which which PRON ital-3222 30 25 the the DET ital-3222 30 26 character character NOUN ital-3222 30 27 appeared appear VERB ital-3222 30 28 and and CCONJ ital-3222 30 29 maintain maintain VERB ital-3222 30 30 separate separate ADJ ital-3222 30 31 frequency frequency NOUN ital-3222 30 32 models model NOUN ital-3222 30 33 for for ADP ital-3222 30 34 different different ADJ ital-3222 30 35 contexts context NOUN ital-3222 30 36 , , PUNCT ital-3222 30 37 or or CCONJ ital-3222 30 38 has have VERB ital-3222 30 39 to to PART ital-3222 30 40 first first ADV ital-3222 30 41 decorrelate decorrelate VERB ital-3222 30 42 the the DET ital-3222 30 43 characters character NOUN ital-3222 30 44 ( ( PUNCT ital-3222 30 45 by by ADP ital-3222 30 46 sorting sort VERB ital-3222 30 47 them they PRON ital-3222 30 48 according accord VERB ital-3222 30 49 to to ADP ital-3222 30 50 their their PRON ital-3222 30 51 contexts context NOUN ital-3222 30 52 ) ) PUNCT ital-3222 30 53 and and CCONJ ital-3222 30 54 then then ADV ital-3222 30 55 use use VERB ital-3222 30 56 an an DET ital-3222 30 57 adaptive adaptive ADJ ital-3222 30 58 frequency frequency NOUN ital-3222 30 59 model model NOUN 
ital-3222 30 60 when when SCONJ ital-3222 30 61 compressing compress VERB ital-3222 30 62 the the DET ital-3222 30 63 output output NOUN ital-3222 30 64 ( ( PUNCT ital-3222 30 65 as as SCONJ ital-3222 30 66 the the DET ital-3222 30 67 characters character NOUN ital-3222 30 68 ’ ’ PART ital-3222 30 69 dependence dependence NOUN ital-3222 30 70 on on ADP ital-3222 30 71 context context NOUN ital-3222 30 72 becomes become VERB ital-3222 30 73 dependence dependence NOUN ital-3222 30 74 on on ADP ital-3222 30 75 position position NOUN ital-3222 30 76 ) ) PUNCT ital-3222 30 77 . . PUNCT ital-3222 31 1 whereas whereas SCONJ ital-3222 31 2 the the DET ital-3222 31 3 former former ADJ ital-3222 31 4 solution solution NOUN ital-3222 31 5 is be AUX ital-3222 31 6 the the DET ital-3222 31 7 foundation foundation NOUN ital-3222 31 8 of of ADP ital-3222 31 9 prediction prediction NOUN ital-3222 31 10 by by ADP ital-3222 31 11 partial partial ADJ ital-3222 31 12 match match NOUN ital-3222 31 13 ( ( PUNCT ital-3222 31 14 ppm ppm ADJ ital-3222 31 15 ) ) PUNCT ital-3222 31 16 algorithms algorithm NOUN ital-3222 31 17 , , PUNCT ital-3222 31 18 burrows burrow NOUN ital-3222 31 19 - - PUNCT ital-3222 31 20 wheeler wheeler NOUN ital-3222 31 21 transform transform NOUN ital-3222 31 22 ( ( PUNCT ital-3222 31 23 bwt bwt PROPN ital-3222 31 24 ) ) PUNCT ital-3222 31 25 compression compression NOUN ital-3222 31 26 algorithms algorithm NOUN ital-3222 31 27 are be AUX ital-3222 31 28 based base VERB ital-3222 31 29 on on ADP ital-3222 31 30 the the DET ital-3222 31 31 latter.8 latter.8 PROPN ital-3222 31 32 witten witten ADJ ital-3222 31 33 et et PROPN ital-3222 31 34 al al PROPN ital-3222 31 35 . . 
PROPN ital-3222 31 36 , , PUNCT ital-3222 31 37 in in ADP ital-3222 31 38 their their PRON ital-3222 31 39 seminal seminal ADJ ital-3222 31 40 work work NOUN ital-3222 31 41 managing managing NOUN ital-3222 31 42 gigabytes gigabyte NOUN ital-3222 31 43 , , PUNCT ital-3222 31 44 emphasize emphasize VERB ital-3222 31 45 the the DET ital-3222 31 46 role role NOUN ital-3222 31 47 of of ADP ital-3222 31 48 data data NOUN ital-3222 31 49 compression compression NOUN ital-3222 31 50 in in ADP ital-3222 31 51 text text NOUN ital-3222 31 52 storage storage NOUN ital-3222 31 53 and and CCONJ ital-3222 31 54 retrieval retrieval NOUN ital-3222 31 55 systems system NOUN ital-3222 31 56 , , PUNCT ital-3222 31 57 stating state VERB ital-3222 31 58 three three NUM ital-3222 31 59 requirements requirement NOUN ital-3222 31 60 for for ADP ital-3222 31 61 the the DET ital-3222 31 62 compression compression NOUN ital-3222 31 63 process process NOUN ital-3222 31 64 : : PUNCT ital-3222 31 65 good good ADJ ital-3222 31 66 compression compression NOUN ital-3222 31 67 , , PUNCT ital-3222 31 68 fast fast ADJ ital-3222 31 69 decoding decoding NOUN ital-3222 31 70 , , PUNCT ital-3222 31 71 and and CCONJ ital-3222 31 72 feasibility feasibility NOUN ital-3222 31 73 of of ADP ital-3222 31 74 decoding decode VERB ital-3222 31 75 individual individual ADJ ital-3222 31 76 documents document NOUN ital-3222 31 77 with with ADP ital-3222 31 78 minimum minimum ADJ ital-3222 31 79 overhead.9 overhead.9 NOUN ital-3222 31 80 the the DET ital-3222 31 81 choice choice NOUN ital-3222 31 82 of of ADP ital-3222 31 83 compression compression NOUN ital-3222 31 84 algorithm algorithm NOUN ital-3222 31 85 should should AUX ital-3222 31 86 depend depend VERB ital-3222 31 87 on on ADP ital-3222 31 88 what what PRON ital-3222 31 89 is be AUX ital-3222 31 90 more more ADV ital-3222 31 91 important important ADJ ital-3222 31 92 for for ADP ital-3222 31 93 a a DET ital-3222 31 94 specific specific ADJ ital-3222 31 95 
application application NOUN ital-3222 31 96 : : PUNCT ital-3222 31 97 better well ADJ ital-3222 31 98 compression compression NOUN ital-3222 31 99 or or CCONJ ital-3222 31 100 faster fast ADV ital-3222 31 101 decoding decoding ADJ ital-3222 31 102 . . PUNCT ital-3222 32 1 an an DET ital-3222 32 2 early early ADJ ital-3222 32 3 work work NOUN ital-3222 32 4 of of ADP ital-3222 32 5 jon jon PROPN ital-3222 32 6 louis louis PROPN ital-3222 32 7 bentley bentley PROPN ital-3222 32 8 and and CCONJ ital-3222 32 9 others other NOUN ital-3222 32 10 showed show VERB ital-3222 32 11 that that SCONJ ital-3222 32 12 a a DET ital-3222 32 13 significant significant ADJ ital-3222 32 14 improvement improvement NOUN ital-3222 32 15 in in ADP ital-3222 32 16 text text NOUN ital-3222 32 17 compression compression NOUN ital-3222 32 18 can can AUX ital-3222 32 19 be be AUX ital-3222 32 20 achieved achieve VERB ital-3222 32 21 by by ADP ital-3222 32 22 treating treat VERB ital-3222 32 23 a a DET ital-3222 32 24 text text NOUN ital-3222 32 25 document document NOUN ital-3222 32 26 as as ADP ital-3222 32 27 a a DET ital-3222 32 28 stream stream NOUN ital-3222 32 29 of of ADP ital-3222 32 30 space space NOUN ital-3222 32 31 - - PUNCT ital-3222 32 32 delimited delimit VERB ital-3222 32 33 words word NOUN ital-3222 32 34 rather rather ADV ital-3222 32 35 than than ADP ital-3222 32 36 individual individual ADJ ital-3222 32 37 characters.10 characters.10 NOUN ital-3222 32 38 this this DET ital-3222 32 39 technique technique NOUN ital-3222 32 40 can can AUX ital-3222 32 41 be be AUX ital-3222 32 42 combined combine VERB ital-3222 32 43 with with ADP ital-3222 32 44 any any DET ital-3222 32 45 general general ADJ ital-3222 32 46 - - PUNCT ital-3222 32 47 purpose purpose NOUN ital-3222 32 48 compression compression NOUN ital-3222 32 49 method method NOUN ital-3222 32 50 in in ADP ital-3222 32 51 two two NUM ital-3222 32 52 ways way NOUN ital-3222 32 53 : : PUNCT ital-3222 32 54 by by ADP 
ital-3222 32 55 redesigning redesign VERB ital-3222 32 56 character character NOUN ital-3222 32 57 - - PUNCT ital-3222 32 58 based base VERB ital-3222 32 59 algorithms algorithm NOUN ital-3222 32 60 as as ADP ital-3222 32 61 word word NOUN ital-3222 32 62 - - PUNCT ital-3222 32 63 based base VERB ital-3222 32 64 ones one NOUN ital-3222 32 65 or or CCONJ ital-3222 32 66 by by ADP ital-3222 32 67 implementing implement VERB ital-3222 32 68 a a DET ital-3222 32 69 two two NUM ital-3222 32 70 - - PUNCT ital-3222 32 71 stage stage NOUN ital-3222 32 72 scheme scheme NOUN ital-3222 32 73 whose whose DET ital-3222 32 74 first first ADJ ital-3222 32 75 step step NOUN ital-3222 32 76 is be AUX ital-3222 32 77 a a DET ital-3222 32 78 transform transform NOUN ital-3222 32 79 replacing replace VERB ital-3222 32 80 words word NOUN ital-3222 32 81 with with ADP ital-3222 32 82 dictionary dictionary ADJ ital-3222 32 83 indices index NOUN ital-3222 32 84 and and CCONJ ital-3222 32 85 whose whose DET ital-3222 32 86 second second ADJ ital-3222 32 87 step step NOUN ital-3222 32 88 is be AUX ital-3222 32 89 passing pass VERB ital-3222 32 90 the the DET ital-3222 32 91 transformed transform VERB ital-3222 32 92 text text NOUN ital-3222 32 93 through through ADP ital-3222 32 94 any any DET ital-3222 32 95 generalpurpose generalpurpose NOUN ital-3222 32 96 compressor.11 compressor.11 NOUN ital-3222 32 97 from from ADP ital-3222 32 98 the the DET ital-3222 32 99 designer designer NOUN ital-3222 32 100 ’s ’s PART ital-3222 32 101 point point NOUN ital-3222 32 102 of of ADP ital-3222 32 103 view view NOUN ital-3222 32 104 , , PUNCT ital-3222 32 105 although although SCONJ ital-3222 32 106 the the DET ital-3222 32 107 first first ADJ ital-3222 32 108 approach approach NOUN ital-3222 32 109 provides provide VERB ital-3222 32 110 more more ADJ ital-3222 32 111 control control NOUN ital-3222 32 112 over over ADP ital-3222 32 113 how how SCONJ ital-3222 32 114 the the DET ital-3222 32 115 text 
text NOUN ital-3222 32 116 is be AUX ital-3222 32 117 modeled model VERB ital-3222 32 118 , , PUNCT ital-3222 32 119 the the DET ital-3222 32 120 second second ADJ ital-3222 32 121 approach approach NOUN ital-3222 32 122 is be AUX ital-3222 32 123 much much ADV ital-3222 32 124 easier easy ADJ ital-3222 32 125 to to PART ital-3222 32 126 implement implement VERB ital-3222 32 127 and and CCONJ ital-3222 32 128 upgrade upgrade VERB ital-3222 32 129 to to ADP ital-3222 32 130 future future ADJ ital-3222 32 131 general general ADJ ital-3222 32 132 - - PUNCT ital-3222 32 133 purpose purpose NOUN ital-3222 32 134 compressors.12 compressors.12 NOUN ital-3222 32 135 notice notice VERB ital-3222 32 136 that that SCONJ ital-3222 32 137 the the DET ital-3222 32 138 separation separation NOUN ital-3222 32 139 of of ADP ital-3222 32 140 the the DET ital-3222 32 141 wordreplacement wordreplacement NOUN ital-3222 32 142 stage stage NOUN ital-3222 32 143 from from ADP ital-3222 32 144 the the DET ital-3222 32 145 compression compression NOUN ital-3222 32 146 stage stage NOUN ital-3222 32 147 does do AUX ital-3222 32 148 not not PART ital-3222 32 149 imply imply VERB ital-3222 32 150 that that SCONJ ital-3222 32 151 two two NUM ital-3222 32 152 distinct distinct ADJ ital-3222 32 153 programs program NOUN ital-3222 32 154 have have VERB ital-3222 32 155 to to PART ital-3222 32 156 be be AUX ital-3222 32 157 used use VERB ital-3222 32 158 — — PUNCT ital-3222 32 159 if if SCONJ ital-3222 32 160 only only ADV ital-3222 32 161 an an DET ital-3222 32 162 appropriate appropriate ADJ ital-3222 32 163 general general ADJ ital-3222 32 164 - - PUNCT ital-3222 32 165 purpose purpose NOUN ital-3222 32 166 compression compression NOUN ital-3222 32 167 software software NOUN ital-3222 32 168 library library NOUN ital-3222 32 169 is be AUX ital-3222 32 170 available available ADJ ital-3222 32 171 , , PUNCT ital-3222 32 172 a a DET ital-3222 32 173 single single ADJ ital-3222 32 174 utility utility 
NOUN ital-3222 32 175 can can AUX ital-3222 32 176 use use VERB ital-3222 32 177 it it PRON ital-3222 32 178 to to PART ital-3222 32 179 compress compress VERB ital-3222 32 180 the the DET ital-3222 32 181 output output NOUN ital-3222 32 182 of of ADP ital-3222 32 183 the the DET ital-3222 32 184 transform transform NOUN ital-3222 32 185 it it PRON ital-3222 32 186 first first ADV ital-3222 32 187 performed perform VERB ital-3222 32 188 . . PUNCT ital-3222 33 1 an an DET ital-3222 33 2 important important ADJ ital-3222 33 3 element element NOUN ital-3222 33 4 of of ADP ital-3222 33 5 every every DET ital-3222 33 6 word word NOUN ital-3222 33 7 - - PUNCT ital-3222 33 8 based base VERB ital-3222 33 9 scheme scheme NOUN ital-3222 33 10 is be AUX ital-3222 33 11 the the DET ital-3222 33 12 dictionary dictionary NOUN ital-3222 33 13 of of ADP ital-3222 33 14 words word NOUN ital-3222 33 15 that that PRON ital-3222 33 16 lists list VERB ital-3222 33 17 character character NOUN ital-3222 33 18 sequences sequence NOUN ital-3222 33 19 that that PRON ital-3222 33 20 should should AUX ital-3222 33 21 be be AUX ital-3222 33 22 treated treat VERB ital-3222 33 23 as as ADP ital-3222 33 24 single single ADJ ital-3222 33 25 entities entity NOUN ital-3222 33 26 . . PUNCT ital-3222 34 1 the the DET ital-3222 34 2 dictionary dictionary NOUN ital-3222 34 3 can can AUX ital-3222 34 4 be be AUX ital-3222 34 5 dynamic dynamic ADJ ital-3222 34 6 ( ( PUNCT ital-3222 34 7 i.e. i.e. X ital-3222 34 8 , , PUNCT ital-3222 34 9 constructed construct VERB ital-3222 34 10 on on ADP ital-3222 34 11 - - PUNCT ital-3222 34 12 line line NOUN ital-3222 34 13 during during ADP ital-3222 34 14 the the DET ital-3222 34 15 compression compression NOUN ital-3222 34 16 of of ADP ital-3222 34 17 every every DET ital-3222 34 18 document),13 document),13 NOUN ital-3222 34 19 static static NOUN ital-3222 34 20 ( ( PUNCT ital-3222 34 21 i.e. i.e. 
X ital-3222 34 22 , , PUNCT ital-3222 34 23 constructed construct VERB ital-3222 34 24 off off ADP ital-3222 34 25 - - PUNCT ital-3222 34 26 line line NOUN ital-3222 34 27 before before ADP ital-3222 34 28 the the DET ital-3222 34 29 compression compression NOUN ital-3222 34 30 stage stage NOUN ital-3222 34 31 and and CCONJ ital-3222 34 32 once once ADV ital-3222 34 33 for for ADP ital-3222 34 34 every every DET ital-3222 34 35 document document NOUN ital-3222 34 36 of of ADP ital-3222 34 37 a a DET ital-3222 34 38 given give VERB ital-3222 34 39 class class NOUN ital-3222 34 40 — — PUNCT ital-3222 34 41 typically typically ADV ital-3222 34 42 , , PUNCT ital-3222 34 43 the the DET ital-3222 34 44 language language NOUN ital-3222 34 45 of of ADP ital-3222 34 46 the the DET ital-3222 34 47 document document NOUN ital-3222 34 48 determines determine VERB ital-3222 34 49 its its PRON ital-3222 34 50 class),14 class),14 NOUN ital-3222 34 51 or or CCONJ ital-3222 34 52 semidynamic semidynamic ADJ ital-3222 34 53 ( ( PUNCT ital-3222 34 54 i.e. i.e. X ital-3222 34 55 , , PUNCT ital-3222 34 56 constructed construct VERB ital-3222 34 57 off off ADP ital-3222 34 58 - - PUNCT ital-3222 34 59 line line NOUN ital-3222 34 60 before before ADP ital-3222 34 61 compression compression NOUN ital-3222 34 62 stage stage NOUN ital-3222 34 63 but but CCONJ ital-3222 34 64 individually individually ADV ital-3222 34 65 for for SCONJ ital-3222 34 66 every every DET ital-3222 34 67 document).15 document).15 ADJ ital-3222 34 68 semidynamic semidynamic ADJ ital-3222 34 69 dictionaries dictionary NOUN ital-3222 34 70 must must AUX ital-3222 34 71 be be AUX ital-3222 34 72 stored store VERB ital-3222 34 73 along along ADP ital-3222 34 74 with with ADP ital-3222 34 75 the the DET ital-3222 34 76 compressed compressed ADJ ital-3222 34 77 document document NOUN ital-3222 34 78 . . 
PUNCT ital-3222 35 1 dynamic dynamic ADJ ital-3222 35 2 dictionaries dictionary NOUN ital-3222 35 3 are be AUX ital-3222 35 4 reconstructed reconstruct VERB ital-3222 35 5 during during ADP ital-3222 35 6 decompression decompression NOUN ital-3222 35 7 ( ( PUNCT ital-3222 35 8 which which PRON ital-3222 35 9 makes make VERB ital-3222 35 10 the the DET ital-3222 35 11 decoding decode VERB ital-3222 35 12 slower slow ADV ital-3222 35 13 than than ADP ital-3222 35 14 in in ADP ital-3222 35 15 the the DET ital-3222 35 16 other other ADJ ital-3222 35 17 cases case NOUN ital-3222 35 18 ) ) PUNCT ital-3222 35 19 . . PUNCT ital-3222 36 1 when when SCONJ ital-3222 36 2 the the DET ital-3222 36 3 static static ADJ ital-3222 36 4 dictionary dictionary NOUN ital-3222 36 5 is be AUX ital-3222 36 6 used use VERB ital-3222 36 7 , , PUNCT ital-3222 36 8 it it PRON ital-3222 36 9 must must AUX ital-3222 36 10 be be AUX ital-3222 36 11 distributed distribute VERB ital-3222 36 12 with with ADP ital-3222 36 13 the the DET ital-3222 36 14 decoder decoder NOUN ital-3222 36 15 ; ; PUNCT ital-3222 36 16 since since SCONJ ital-3222 36 17 a a DET ital-3222 36 18 single single ADJ ital-3222 36 19 dictionary dictionary NOUN ital-3222 36 20 is be AUX ital-3222 36 21 used use VERB ital-3222 36 22 to to PART ital-3222 36 23 compress compress VERB ital-3222 36 24 multiple multiple ADJ ital-3222 36 25 files file NOUN ital-3222 36 26 , , PUNCT ital-3222 36 27 it it PRON ital-3222 36 28 usually usually ADV ital-3222 36 29 attains attain VERB ital-3222 36 30 the the DET ital-3222 36 31 best good ADJ ital-3222 36 32 compression compression NOUN ital-3222 36 33 ratios ratio NOUN ital-3222 36 34 , , PUNCT ital-3222 36 35 but but CCONJ ital-3222 36 36 it it PRON ital-3222 36 37 is be AUX ital-3222 36 38 only only ADV ital-3222 36 39 effective effective ADJ ital-3222 36 40 with with ADP ital-3222 36 41 documents document NOUN ital-3222 36 42 of of ADP ital-3222 36 43 the the DET ital-3222 36 44 class 
class NOUN ital-3222 36 45 it it PRON ital-3222 36 46 was be AUX ital-3222 36 47 originally originally ADV ital-3222 36 48 prepared prepared ADJ ital-3222 36 49 for for ADP ital-3222 36 50 . . PUNCT ital-3222 37 1 n n ADV ital-3222 37 2 the the DET ital-3222 37 3 basic basic ADJ ital-3222 37 4 compression compression NOUN ital-3222 37 5 scheme scheme NOUN ital-3222 37 6 the the DET ital-3222 37 7 basis basis NOUN ital-3222 37 8 of of ADP ital-3222 37 9 our our PRON ital-3222 37 10 approach approach NOUN ital-3222 37 11 is be AUX ital-3222 37 12 a a DET ital-3222 37 13 word word NOUN ital-3222 37 14 - - PUNCT ital-3222 37 15 based base VERB ital-3222 37 16 , , PUNCT ital-3222 37 17 lossless lossless NOUN ital-3222 37 18 text text NOUN ital-3222 37 19 compression compression NOUN ital-3222 37 20 scheme scheme NOUN ital-3222 37 21 , , PUNCT ital-3222 37 22 dubbed dub VERB ital-3222 37 23 compression compression NOUN ital-3222 37 24 for for ADP ital-3222 37 25 textual textual ADJ ital-3222 37 26 digital digital ADJ ital-3222 37 27 libraries library NOUN ital-3222 37 28 ( ( PUNCT ital-3222 37 29 ctdl ctdl NOUN ital-3222 37 30 ) ) PUNCT ital-3222 37 31 . . PUNCT ital-3222 38 1 the the DET ital-3222 38 2 scheme scheme NOUN ital-3222 38 3 consists consist VERB ital-3222 38 4 of of ADP ital-3222 38 5 up up ADP ital-3222 38 6 to to PART ital-3222 38 7 four four NUM ital-3222 38 8 stages stage NOUN ital-3222 38 9 : : PUNCT ital-3222 38 10 1 1 X ital-3222 38 11 . . X ital-3222 38 12 document document NOUN ital-3222 38 13 decompression decompression NOUN ital-3222 38 14 2 2 NUM ital-3222 38 15 . . PUNCT ital-3222 38 16 dictionary dictionary ADJ ital-3222 38 17 composition composition NOUN ital-3222 38 18 3 3 NUM ital-3222 38 19 . . NOUN ital-3222 38 20 text text NOUN ital-3222 38 21 transform transform NOUN ital-3222 38 22 4 4 NUM ital-3222 38 23 . . 
PUNCT ital-3222 38 24 compression compression NOUN ital-3222 38 25 stages stage NOUN ital-3222 38 26 1–2 1–2 PRON ital-3222 38 27 are be AUX ital-3222 38 28 optional optional ADJ ital-3222 38 29 . . PUNCT ital-3222 39 1 the the DET ital-3222 39 2 first first ADJ ital-3222 39 3 is be AUX ital-3222 39 4 for for ADP ital-3222 39 5 retrieving retrieve VERB ital-3222 39 6 textual textual ADJ ital-3222 39 7 content content NOUN ital-3222 39 8 from from ADP ital-3222 39 9 files file NOUN ital-3222 39 10 compressed compress VERB ital-3222 39 11 poorly poorly ADV ital-3222 39 12 with with ADP ital-3222 39 13 generalpurpose generalpurpose ADJ ital-3222 39 14 methods method NOUN ital-3222 39 15 . . PUNCT ital-3222 40 1 it it PRON ital-3222 40 2 is be AUX ital-3222 40 3 only only ADV ital-3222 40 4 executed execute VERB ital-3222 40 5 for for ADP ital-3222 40 6 compressed compress VERB ital-3222 40 7 input input NOUN ital-3222 40 8 documents document NOUN ital-3222 40 9 . . PUNCT ital-3222 41 1 it it PRON ital-3222 41 2 uses use VERB ital-3222 41 3 an an DET ital-3222 41 4 embedded embed VERB ital-3222 41 5 decompressor decompressor NOUN ital-3222 41 6 for for ADP ital-3222 41 7 files file NOUN ital-3222 41 8 compressed compress VERB ital-3222 41 9 using use VERB ital-3222 41 10 the the DET ital-3222 41 11 deflate deflate ADJ ital-3222 41 12 algorithm,16 algorithm,16 NOUN ital-3222 41 13 but but CCONJ ital-3222 41 14 an an DET ital-3222 41 15 external external ADJ ital-3222 41 16 tool tool NOUN ital-3222 41 17 — — PUNCT ital-3222 41 18 precomp precomp NOUN ital-3222 41 19 — — PUNCT ital-3222 41 20 is be AUX ital-3222 41 21 used use VERB ital-3222 41 22 to to PART ital-3222 41 23 decode decode VERB ital-3222 41 24 natively natively ADV ital-3222 41 25 compressed compress VERB ital-3222 41 26 pdf pdf PROPN ital-3222 41 27 documents.17 documents.17 PROPN ital-3222 41 28 the the DET ital-3222 41 29 second second ADJ ital-3222 41 30 stage stage NOUN ital-3222 41 31 is be AUX 
ital-3222 41 32 for for ADP ital-3222 41 33 constructing construct VERB ital-3222 41 34 the the DET ital-3222 41 35 dictionary dictionary NOUN ital-3222 41 36 of of ADP ital-3222 41 37 the the DET ital-3222 41 38 most most ADV ital-3222 41 39 frequent frequent ADJ ital-3222 41 40 words word NOUN ital-3222 41 41 in in ADP ital-3222 41 42 the the DET ital-3222 41 43 processed process VERB ital-3222 41 44 document document NOUN ital-3222 41 45 . . PUNCT ital-3222 42 1 doing do VERB ital-3222 42 2 so so ADV ital-3222 42 3 is be AUX ital-3222 42 4 a a DET ital-3222 42 5 good good ADJ ital-3222 42 6 idea idea NOUN ital-3222 42 7 when when SCONJ ital-3222 42 8 the the DET ital-3222 42 9 compressed compress VERB ital-3222 42 10 documents document NOUN ital-3222 42 11 have have VERB ital-3222 42 12 no no DET ital-3222 42 13 common common ADJ ital-3222 42 14 set set NOUN ital-3222 42 15 of of ADP ital-3222 42 16 words word NOUN ital-3222 42 17 . . PUNCT ital-3222 43 1 if if SCONJ ital-3222 43 2 there there PRON ital-3222 43 3 are be VERB ital-3222 43 4 many many ADJ ital-3222 43 5 documents document NOUN ital-3222 43 6 in in ADP ital-3222 43 7 the the DET ital-3222 43 8 same same ADJ ital-3222 43 9 language language NOUN ital-3222 43 10 , , PUNCT ital-3222 43 11 a a DET ital-3222 43 12 common common ADJ ital-3222 43 13 dictionary dictionary ADJ ital-3222 43 14 fares fare NOUN ital-3222 43 15 better well ADV ital-3222 43 16 — — PUNCT ital-3222 43 17 it it PRON ital-3222 43 18 usually usually ADV ital-3222 43 19 does do AUX ital-3222 43 20 not not PART ital-3222 43 21 pay pay VERB ital-3222 43 22 off off ADP ital-3222 43 23 to to PART ital-3222 43 24 store store VERB ital-3222 43 25 an an DET ital-3222 43 26 individual individual ADJ ital-3222 43 27 dictionary dictionary NOUN ital-3222 43 28 with with ADP ital-3222 43 29 each each DET ital-3222 43 30 file file NOUN ital-3222 43 31 because because SCONJ ital-3222 43 32 they they PRON ital-3222 43 33 all all PRON ital-3222 43 34 
contain contain VERB ital-3222 43 35 similar similar ADJ ital-3222 43 36 lists list NOUN ital-3222 43 37 of of ADP ital-3222 43 38 words word NOUN ital-3222 43 39 . . PUNCT ital-3222 44 1 for for ADP ital-3222 44 2 this this DET ital-3222 44 3 reason reason NOUN ital-3222 44 4 we we PRON ital-3222 44 5 have have AUX ital-3222 44 6 developed develop VERB ital-3222 44 7 two two NUM ital-3222 44 8 variants variant NOUN ital-3222 44 9 of of ADP ital-3222 44 10 the the DET ital-3222 44 11 scheme scheme NOUN ital-3222 44 12 . . PUNCT ital-3222 45 1 the the DET ital-3222 45 2 basic basic ADJ ital-3222 45 3 ctdl ctdl NOUN ital-3222 45 4 includes include VERB ital-3222 45 5 stage stage NOUN ital-3222 45 6 2 2 NUM ital-3222 45 7 ; ; PUNCT ital-3222 45 8 therefore therefore ADV ital-3222 45 9 it it PRON ital-3222 45 10 can can AUX ital-3222 45 11 use use VERB ital-3222 45 12 a a DET ital-3222 45 13 document document NOUN ital-3222 45 14 - - PUNCT ital-3222 45 15 specific specific ADJ ital-3222 45 16 semidynamic semidynamic ADJ ital-3222 45 17 dictionary dictionary NOUN ital-3222 45 18 in in ADP ital-3222 45 19 the the DET ital-3222 45 20 third third ADJ ital-3222 45 21 stage stage NOUN ital-3222 45 22 . . PUNCT ital-3222 46 1 the the DET ital-3222 46 2 ctdl+ ctdl+ PROPN ital-3222 46 3 variant variant NOUN ital-3222 46 4 uses use VERB ital-3222 46 5 a a DET ital-3222 46 6 static static ADJ ital-3222 46 7 dictionary dictionary NOUN ital-3222 46 8 common common ADJ ital-3222 46 9 for for ADP ital-3222 46 10 all all DET ital-3222 46 11 files file NOUN ital-3222 46 12 in in ADP ital-3222 46 13 the the DET ital-3222 46 14 same same ADJ ital-3222 46 15 language language NOUN ital-3222 46 16 ; ; PUNCT ital-3222 46 17 therefore therefore ADV ital-3222 46 18 it it PRON ital-3222 46 19 can can AUX ital-3222 46 20 omit omit VERB ital-3222 46 21 stage stage NOUN ital-3222 46 22 2 2 NUM ital-3222 46 23 . . 
PUNCT ital-3222 46 24 during during ADP ital-3222 46 25 stage stage NOUN ital-3222 46 26 2 2 NUM ital-3222 46 27 , , PUNCT ital-3222 46 28 all all DET ital-3222 46 29 the the DET ital-3222 46 30 potential potential ADJ ital-3222 46 31 dictionary dictionary ADJ ital-3222 46 32 items item NOUN ital-3222 46 33 that that PRON ital-3222 46 34 meet meet VERB ital-3222 46 35 the the DET ital-3222 46 36 word word NOUN ital-3222 46 37 requirements requirement NOUN ital-3222 46 38 are be AUX ital-3222 46 39 extracted extract VERB ital-3222 46 40 from from ADP ital-3222 46 41 the the DET ital-3222 46 42 document document NOUN ital-3222 46 43 and and CCONJ ital-3222 46 44 then then ADV ital-3222 46 45 sorted sort VERB ital-3222 46 46 according accord VERB ital-3222 46 47 to to ADP ital-3222 46 48 their their PRON ital-3222 46 49 frequency frequency NOUN ital-3222 46 50 the the DET ital-3222 46 51 efficient efficient ADJ ital-3222 46 52 storage storage NOUN ital-3222 46 53 of of ADP ital-3222 46 54 text text NOUN ital-3222 46 55 documents document NOUN ital-3222 46 56 in in ADP ital-3222 46 57 digital digital ADJ ital-3222 46 58 libraries library NOUN ital-3222 46 59 | | NOUN ital-3222 46 60 skibiński skibiński NOUN ital-3222 46 61 and and CCONJ ital-3222 46 62 swacha swacha NOUN ital-3222 46 63 145 145 NUM ital-3222 46 64 to to PART ital-3222 46 65 form form VERB ital-3222 46 66 a a DET ital-3222 46 67 dictionary dictionary NOUN ital-3222 46 68 . . 
PUNCT ital-3222 47 1 the the DET ital-3222 47 2 requirements requirement NOUN ital-3222 47 3 define define VERB ital-3222 47 4 the the DET ital-3222 47 5 minimum minimum ADJ ital-3222 47 6 length length NOUN ital-3222 47 7 and and CCONJ ital-3222 47 8 frequency frequency NOUN ital-3222 47 9 of of ADP ital-3222 47 10 a a DET ital-3222 47 11 word word NOUN ital-3222 47 12 in in ADP ital-3222 47 13 the the DET ital-3222 47 14 document document NOUN ital-3222 47 15 ( ( PUNCT ital-3222 47 16 by by ADP ital-3222 47 17 default default NOUN ital-3222 47 18 , , PUNCT ital-3222 47 19 2 2 NUM ital-3222 47 20 and and CCONJ ital-3222 47 21 6 6 NUM ital-3222 47 22 respectively respectively ADV ital-3222 47 23 ) ) PUNCT ital-3222 47 24 as as ADV ital-3222 47 25 well well ADV ital-3222 47 26 as as ADP ital-3222 47 27 its its PRON ital-3222 47 28 content content NOUN ital-3222 47 29 . . PUNCT ital-3222 48 1 only only ADV ital-3222 48 2 the the DET ital-3222 48 3 following follow VERB ital-3222 48 4 kinds kind NOUN ital-3222 48 5 of of ADP ital-3222 48 6 strings string NOUN ital-3222 48 7 are be AUX ital-3222 48 8 accepted accept VERB ital-3222 48 9 into into ADP ital-3222 48 10 the the DET ital-3222 48 11 dictionary dictionary PROPN ital-3222 48 12 : : PUNCT ital-3222 48 13 n n ADV ital-3222 48 14 a a DET ital-3222 48 15 sequence sequence NOUN ital-3222 48 16 of of ADP ital-3222 48 17 lowercase lowercase NOUN ital-3222 48 18 and and CCONJ ital-3222 48 19 uppercase uppercase ADJ ital-3222 48 20 letters letter NOUN ital-3222 48 21 ( ( PUNCT ital-3222 48 22 “ " PUNCT ital-3222 48 23 a”–“z a”–“z SPACE ital-3222 48 24 ” " PUNCT ital-3222 48 25 , , PUNCT ital-3222 48 26 “ " PUNCT ital-3222 48 27 a”–“z a”–“z SPACE ital-3222 48 28 ” " PUNCT ital-3222 48 29 ) ) PUNCT ital-3222 48 30 and and CCONJ ital-3222 48 31 characters character NOUN ital-3222 48 32 with with ADP ital-3222 48 33 ascii ascii PROPN ital-3222 48 34 code code NOUN ital-3222 48 35 values value NOUN ital-3222 48 36 from from 
ADP ital-3222 48 37 range range NOUN ital-3222 48 38 128–255 128–255 NUM ital-3222 48 39 ( ( PUNCT ital-3222 48 40 thus thus ADV ital-3222 48 41 it it PRON ital-3222 48 42 supports support VERB ital-3222 48 43 any any DET ital-3222 48 44 typical typical ADJ ital-3222 48 45 8 8 NUM ital-3222 48 46 - - PUNCT ital-3222 48 47 bit bit NOUN ital-3222 48 48 text text NOUN ital-3222 48 49 encoding encoding NOUN ital-3222 48 50 and and CCONJ ital-3222 48 51 also also ADV ital-3222 48 52 utf-8 utf-8 SPACE ital-3222 48 53 ) ) PUNCT ital-3222 49 1 n n CCONJ ital-3222 49 2 url url NOUN ital-3222 49 3 address address NOUN ital-3222 49 4 prefixes prefix NOUN ital-3222 49 5 of of ADP ital-3222 49 6 the the DET ital-3222 49 7 form form NOUN ital-3222 49 8 “ " PUNCT ital-3222 49 9 http:// http:// VERB ital-3222 49 10 domain/ domain/ X ital-3222 49 11 , , PUNCT ital-3222 49 12 ” " PUNCT ital-3222 49 13 where where SCONJ ital-3222 49 14 domain domain NOUN ital-3222 49 15 is be AUX ital-3222 49 16 any any DET ital-3222 49 17 combination combination NOUN ital-3222 49 18 of of ADP ital-3222 49 19 letters letter NOUN ital-3222 49 20 , , PUNCT ital-3222 49 21 digits digit NOUN ital-3222 49 22 , , PUNCT ital-3222 49 23 dots dot NOUN ital-3222 49 24 , , PUNCT ital-3222 49 25 and and CCONJ ital-3222 49 26 dashes dash VERB ital-3222 49 27 n n ADV ital-3222 49 28 e e NOUN ital-3222 49 29 - - NOUN ital-3222 49 30 mails mail NOUN ital-3222 49 31 — — PUNCT ital-3222 49 32 patterns pattern NOUN ital-3222 49 33 of of ADP ital-3222 49 34 the the DET ital-3222 49 35 form form NOUN ital-3222 49 36 “ " PUNCT ital-3222 49 37 login@domain login@domain NOUN ital-3222 49 38 , , PUNCT ital-3222 49 39 ” " PUNCT ital-3222 49 40 where where SCONJ ital-3222 49 41 login login NOUN ital-3222 49 42 and and CCONJ ital-3222 49 43 domain domain NOUN ital-3222 49 44 are be AUX ital-3222 49 45 any any DET ital-3222 49 46 combination combination NOUN ital-3222 49 47 of of ADP ital-3222 49 48 letters letter NOUN ital-3222 
49 49 , , PUNCT ital-3222 49 50 digits digit NOUN ital-3222 49 51 , , PUNCT ital-3222 49 52 dots dot NOUN ital-3222 49 53 , , PUNCT ital-3222 49 54 and and CCONJ ital-3222 49 55 dashes dash VERB ital-3222 49 56 n n ADV ital-3222 49 57 runs run NOUN ital-3222 49 58 of of ADP ital-3222 49 59 spaces space NOUN ital-3222 49 60 stage stage VERB ital-3222 49 61 3 3 NUM ital-3222 49 62 begins begin VERB ital-3222 49 63 with with ADP ital-3222 49 64 parsing parse VERB ital-3222 49 65 the the DET ital-3222 49 66 text text NOUN ital-3222 49 67 into into ADP ital-3222 49 68 tokens token NOUN ital-3222 49 69 . . PUNCT ital-3222 50 1 the the DET ital-3222 50 2 tokens token NOUN ital-3222 50 3 are be AUX ital-3222 50 4 defined define VERB ital-3222 50 5 by by ADP ital-3222 50 6 their their PRON ital-3222 50 7 content content NOUN ital-3222 50 8 ; ; PUNCT ital-3222 50 9 as as SCONJ ital-3222 50 10 four four NUM ital-3222 50 11 types type NOUN ital-3222 50 12 of of ADP ital-3222 50 13 content content NOUN ital-3222 50 14 are be AUX ital-3222 50 15 distinguished distinguish VERB ital-3222 50 16 , , PUNCT ital-3222 50 17 there there PRON ital-3222 50 18 are be VERB ital-3222 50 19 also also ADV ital-3222 50 20 four four NUM ital-3222 50 21 classes class NOUN ital-3222 50 22 of of ADP ital-3222 50 23 tokens token NOUN ital-3222 50 24 : : PUNCT ital-3222 50 25 words word NOUN ital-3222 50 26 , , PUNCT ital-3222 50 27 numbers number NOUN ital-3222 50 28 , , PUNCT ital-3222 50 29 special special ADJ ital-3222 50 30 tokens token NOUN ital-3222 50 31 , , PUNCT ital-3222 50 32 and and CCONJ ital-3222 50 33 characters character NOUN ital-3222 50 34 . . 
PUNCT ital-3222 51 1 every every DET ital-3222 51 2 token token NOUN ital-3222 51 3 is be AUX ital-3222 51 4 then then ADV ital-3222 51 5 encoded encode VERB ital-3222 51 6 in in ADP ital-3222 51 7 a a DET ital-3222 51 8 way way NOUN ital-3222 51 9 that that PRON ital-3222 51 10 depends depend VERB ital-3222 51 11 on on ADP ital-3222 51 12 the the DET ital-3222 51 13 class class NOUN ital-3222 51 14 it it PRON ital-3222 51 15 belongs belong VERB ital-3222 51 16 to to ADP ital-3222 51 17 . . PUNCT ital-3222 52 1 the the DET ital-3222 52 2 words word NOUN ital-3222 52 3 are be AUX ital-3222 52 4 those those DET ital-3222 52 5 character character NOUN ital-3222 52 6 sequences sequence NOUN ital-3222 52 7 that that PRON ital-3222 52 8 are be AUX ital-3222 52 9 listed list VERB ital-3222 52 10 in in ADP ital-3222 52 11 the the DET ital-3222 52 12 dictionary dictionary NOUN ital-3222 52 13 . . PUNCT ital-3222 53 1 every every DET ital-3222 53 2 word word NOUN ital-3222 53 3 is be AUX ital-3222 53 4 replaced replace VERB ital-3222 53 5 with with ADP ital-3222 53 6 its its PRON ital-3222 53 7 dictionary dictionary ADJ ital-3222 53 8 index index NOUN ital-3222 53 9 , , PUNCT ital-3222 53 10 which which PRON ital-3222 53 11 is be AUX ital-3222 53 12 then then ADV ital-3222 53 13 encoded encode VERB ital-3222 53 14 using use VERB ital-3222 53 15 symbols symbol NOUN ital-3222 53 16 that that PRON ital-3222 53 17 are be AUX ital-3222 53 18 rare rare ADJ ital-3222 53 19 or or CCONJ ital-3222 53 20 nonexistent nonexistent ADJ ital-3222 53 21 in in ADP ital-3222 53 22 the the DET ital-3222 53 23 input input NOUN ital-3222 53 24 document document NOUN ital-3222 53 25 . . 
PUNCT ital-3222 54 1 indexes index NOUN ital-3222 54 2 are be AUX ital-3222 54 3 encoded encode VERB ital-3222 54 4 with with ADP ital-3222 54 5 code code NOUN ital-3222 54 6 words word NOUN ital-3222 54 7 that that PRON ital-3222 54 8 are be AUX ital-3222 54 9 between between ADP ital-3222 54 10 one one NUM ital-3222 54 11 and and CCONJ ital-3222 54 12 four four NUM ital-3222 54 13 bytes byte NOUN ital-3222 54 14 long long ADV ital-3222 54 15 , , PUNCT ital-3222 54 16 with with ADP ital-3222 54 17 lower low ADJ ital-3222 54 18 indexes index NOUN ital-3222 54 19 ( ( PUNCT ital-3222 54 20 denoting denote VERB ital-3222 54 21 more more ADV ital-3222 54 22 frequent frequent ADJ ital-3222 54 23 words word NOUN ital-3222 54 24 ) ) PUNCT ital-3222 54 25 being be AUX ital-3222 54 26 assigned assign VERB ital-3222 54 27 shorter short ADJ ital-3222 54 28 code code NOUN ital-3222 54 29 words word NOUN ital-3222 54 30 . . PUNCT ital-3222 55 1 the the DET ital-3222 55 2 numbers number NOUN ital-3222 55 3 are be AUX ital-3222 55 4 sequences sequence NOUN ital-3222 55 5 of of ADP ital-3222 55 6 decimal decimal ADJ ital-3222 55 7 digits digit NOUN ital-3222 55 8 , , PUNCT ital-3222 55 9 which which PRON ital-3222 55 10 are be AUX ital-3222 55 11 encoded encode VERB ital-3222 55 12 with with ADP ital-3222 55 13 a a DET ital-3222 55 14 dense dense ADJ ital-3222 55 15 binary binary ADJ ital-3222 55 16 code code NOUN ital-3222 55 17 , , PUNCT ital-3222 55 18 and and CCONJ ital-3222 55 19 , , PUNCT ital-3222 55 20 similarly similarly ADV ital-3222 55 21 to to ADP ital-3222 55 22 letters letter NOUN ital-3222 55 23 , , PUNCT ital-3222 55 24 placed place VERB ital-3222 55 25 in in ADP ital-3222 55 26 a a DET ital-3222 55 27 separate separate ADJ ital-3222 55 28 location location NOUN ital-3222 55 29 in in ADP ital-3222 55 30 the the DET ital-3222 55 31 output output NOUN ital-3222 55 32 file file NOUN ital-3222 55 33 . . 
PUNCT ital-3222 56 1 the the DET ital-3222 56 2 special special ADJ ital-3222 56 3 tokens token NOUN ital-3222 56 4 can can AUX ital-3222 56 5 be be AUX ital-3222 56 6 decimal decimal ADJ ital-3222 56 7 fractions fraction NOUN ital-3222 56 8 , , PUNCT ital-3222 56 9 ip ip NOUN ital-3222 56 10 numerical numerical PROPN ital-3222 56 11 addresses address NOUN ital-3222 56 12 , , PUNCT ital-3222 56 13 dates date NOUN ital-3222 56 14 , , PUNCT ital-3222 56 15 times time NOUN ital-3222 56 16 , , PUNCT ital-3222 56 17 and and CCONJ ital-3222 56 18 numerical numerical ADJ ital-3222 56 19 ranges range NOUN ital-3222 56 20 . . PUNCT ital-3222 57 1 as as SCONJ ital-3222 57 2 they they PRON ital-3222 57 3 have have VERB ital-3222 57 4 a a DET ital-3222 57 5 strict strict ADJ ital-3222 57 6 format format NOUN ital-3222 57 7 and and CCONJ ital-3222 57 8 differ differ VERB ital-3222 57 9 only only ADV ital-3222 57 10 in in ADP ital-3222 57 11 numerical numerical ADJ ital-3222 57 12 values value NOUN ital-3222 57 13 , , PUNCT ital-3222 57 14 they they PRON ital-3222 57 15 are be AUX ital-3222 57 16 encoded encode VERB ital-3222 57 17 as as ADP ital-3222 57 18 sequences sequence NOUN ital-3222 57 19 of of ADP ital-3222 57 20 numbers.18 numbers.18 PROPN ital-3222 57 21 finally finally ADV ital-3222 57 22 , , PUNCT ital-3222 57 23 the the DET ital-3222 57 24 characters character NOUN ital-3222 57 25 are be AUX ital-3222 57 26 the the DET ital-3222 57 27 tokens token NOUN ital-3222 57 28 that that PRON ital-3222 57 29 do do AUX ital-3222 57 30 not not PART ital-3222 57 31 belong belong VERB ital-3222 57 32 to to ADP ital-3222 57 33 any any PRON ital-3222 57 34 of of ADP ital-3222 57 35 the the DET ital-3222 57 36 aforementioned aforementione VERB ital-3222 57 37 group group NOUN ital-3222 57 38 . . 
PUNCT ital-3222 58 1 they they PRON ital-3222 58 2 are be AUX ital-3222 58 3 simply simply ADV ital-3222 58 4 copied copy VERB ital-3222 58 5 to to ADP ital-3222 58 6 the the DET ital-3222 58 7 output output NOUN ital-3222 58 8 file file NOUN ital-3222 58 9 , , PUNCT ital-3222 58 10 with with ADP ital-3222 58 11 the the DET ital-3222 58 12 exception exception NOUN ital-3222 58 13 of of ADP ital-3222 58 14 those those DET ital-3222 58 15 rare rare ADJ ital-3222 58 16 characters character NOUN ital-3222 58 17 that that PRON ital-3222 58 18 were be AUX ital-3222 58 19 used use VERB ital-3222 58 20 to to PART ital-3222 58 21 construct construct VERB ital-3222 58 22 code code NOUN ital-3222 58 23 words word NOUN ital-3222 58 24 ; ; PUNCT ital-3222 58 25 they they PRON ital-3222 58 26 are be AUX ital-3222 58 27 copied copy VERB ital-3222 58 28 as as ADV ital-3222 58 29 well well ADV ital-3222 58 30 , , PUNCT ital-3222 58 31 but but CCONJ ital-3222 58 32 have have VERB ital-3222 58 33 to to PART ital-3222 58 34 be be AUX ital-3222 58 35 preceded precede VERB ital-3222 58 36 with with ADP ital-3222 58 37 a a DET ital-3222 58 38 special special ADJ ital-3222 58 39 escape escape NOUN ital-3222 58 40 symbol symbol NOUN ital-3222 58 41 . . 
PUNCT ital-3222 59 1 the the DET ital-3222 59 2 specialized specialized ADJ ital-3222 59 3 transform transform NOUN ital-3222 59 4 variants variant NOUN ital-3222 59 5 ( ( PUNCT ital-3222 59 6 see see VERB ital-3222 59 7 the the DET ital-3222 59 8 next next ADJ ital-3222 59 9 section section NOUN ital-3222 59 10 ) ) PUNCT ital-3222 59 11 distinguish distinguish VERB ital-3222 59 12 three three NUM ital-3222 59 13 additional additional ADJ ital-3222 59 14 classes class NOUN ital-3222 59 15 from from ADP ital-3222 59 16 the the DET ital-3222 59 17 character character NOUN ital-3222 59 18 class class NOUN ital-3222 59 19 : : PUNCT ital-3222 59 20 letters letter NOUN ital-3222 59 21 ( ( PUNCT ital-3222 59 22 words word NOUN ital-3222 59 23 not not PART ital-3222 59 24 in in ADP ital-3222 59 25 the the DET ital-3222 59 26 dictionary dictionary NOUN ital-3222 59 27 ) ) PUNCT ital-3222 59 28 , , PUNCT ital-3222 59 29 single single ADJ ital-3222 59 30 white white ADJ ital-3222 59 31 spaces space NOUN ital-3222 59 32 , , PUNCT ital-3222 59 33 and and CCONJ ital-3222 59 34 multiple multiple ADJ ital-3222 59 35 white white ADJ ital-3222 59 36 spaces space NOUN ital-3222 59 37 . . PUNCT ital-3222 60 1 stage stage NOUN ital-3222 60 2 4 4 NUM ital-3222 60 3 could could AUX ital-3222 60 4 use use VERB ital-3222 60 5 any any DET ital-3222 60 6 general general ADJ ital-3222 60 7 - - PUNCT ital-3222 60 8 purpose purpose NOUN ital-3222 60 9 compression compression NOUN ital-3222 60 10 method method NOUN ital-3222 60 11 to to PART ital-3222 60 12 encode encode VERB ital-3222 60 13 the the DET ital-3222 60 14 output output NOUN ital-3222 60 15 of of ADP ital-3222 60 16 stage stage NOUN ital-3222 60 17 3 3 NUM ital-3222 60 18 . . 
PUNCT ital-3222 61 1 for for ADP ital-3222 61 2 this this DET ital-3222 61 3 role role NOUN ital-3222 61 4 , , PUNCT ital-3222 61 5 we we PRON ital-3222 61 6 have have AUX ital-3222 61 7 investigated investigate VERB ital-3222 61 8 several several ADJ ital-3222 61 9 open open ADV ital-3222 61 10 - - PUNCT ital-3222 61 11 licensed licensed ADJ ital-3222 61 12 , , PUNCT ital-3222 61 13 generalpurpose generalpurpose NOUN ital-3222 61 14 compression compression NOUN ital-3222 61 15 algorithms algorithm NOUN ital-3222 61 16 that that PRON ital-3222 61 17 differ differ VERB ital-3222 61 18 in in ADP ital-3222 61 19 speed speed NOUN ital-3222 61 20 and and CCONJ ital-3222 61 21 efficiency efficiency NOUN ital-3222 61 22 . . PUNCT ital-3222 62 1 as as SCONJ ital-3222 62 2 we we PRON ital-3222 62 3 believe believe VERB ital-3222 62 4 that that SCONJ ital-3222 62 5 document document NOUN ital-3222 62 6 access access NOUN ital-3222 62 7 speed speed NOUN ital-3222 62 8 is be AUX ital-3222 62 9 important important ADJ ital-3222 62 10 to to ADP ital-3222 62 11 textual textual ADJ ital-3222 62 12 digital digital ADJ ital-3222 62 13 libraries library NOUN ital-3222 62 14 , , PUNCT ital-3222 62 15 we we PRON ital-3222 62 16 have have AUX ital-3222 62 17 decided decide VERB ital-3222 62 18 to to PART ital-3222 62 19 focus focus VERB ital-3222 62 20 on on ADP ital-3222 62 21 lz lz PROPN ital-3222 62 22 – – PUNCT ital-3222 62 23 type type NOUN ital-3222 62 24 algorithms algorithm NOUN ital-3222 62 25 because because SCONJ ital-3222 62 26 they they PRON ital-3222 62 27 offer offer VERB ital-3222 62 28 the the DET ital-3222 62 29 best good ADJ ital-3222 62 30 decompression decompression NOUN ital-3222 62 31 times time NOUN ital-3222 62 32 . . 
PUNCT ital-3222 63 1 ctdl ctdl PROPN ital-3222 63 2 has have AUX ital-3222 63 3 two two NUM ital-3222 63 4 embedded embed VERB ital-3222 63 5 backend backend ADJ ital-3222 63 6 compressors compressor NOUN ital-3222 63 7 : : PUNCT ital-3222 63 8 the the DET ital-3222 63 9 standard standard ADJ ital-3222 63 10 deflate deflate PROPN ital-3222 63 11 and and CCONJ ital-3222 63 12 lzma lzma PROPN ital-3222 63 13 , , PUNCT ital-3222 63 14 wellknown wellknown ADJ ital-3222 63 15 for for ADP ital-3222 63 16 its its PRON ital-3222 63 17 ability ability NOUN ital-3222 63 18 to to PART ital-3222 63 19 attain attain VERB ital-3222 63 20 high high ADJ ital-3222 63 21 compression compression NOUN ital-3222 63 22 ratios.19 ratios.19 NOUN ital-3222 63 23 n n ADP ital-3222 63 24 adapting adapt VERB ital-3222 63 25 the the DET ital-3222 63 26 transform transform NOUN ital-3222 63 27 for for ADP ital-3222 63 28 individual individual ADJ ital-3222 63 29 text text NOUN ital-3222 63 30 document document NOUN ital-3222 63 31 formats format VERB ital-3222 63 32 the the DET ital-3222 63 33 text text NOUN ital-3222 63 34 document document NOUN ital-3222 63 35 formats format NOUN ital-3222 63 36 have have VERB ital-3222 63 37 individual individual ADJ ital-3222 63 38 characteristics characteristic NOUN ital-3222 63 39 ; ; PUNCT ital-3222 63 40 therefore therefore ADV ital-3222 63 41 the the DET ital-3222 63 42 compression compression NOUN ital-3222 63 43 ratio ratio NOUN ital-3222 63 44 can can AUX ital-3222 63 45 be be AUX ital-3222 63 46 improved improve VERB ital-3222 63 47 by by ADP ital-3222 63 48 adapting adapt VERB ital-3222 63 49 the the DET ital-3222 63 50 transform transform NOUN ital-3222 63 51 for for ADP ital-3222 63 52 a a DET ital-3222 63 53 particular particular ADJ ital-3222 63 54 format format NOUN ital-3222 63 55 . . 
PUNCT ital-3222 64 1 as as SCONJ ital-3222 64 2 we we PRON ital-3222 64 3 noted note VERB ital-3222 64 4 in in ADP ital-3222 64 5 the the DET ital-3222 64 6 introduction introduction NOUN ital-3222 64 7 , , PUNCT ital-3222 64 8 we we PRON ital-3222 64 9 propose propose VERB ital-3222 64 10 a a DET ital-3222 64 11 set set NOUN ital-3222 64 12 of of ADP ital-3222 64 13 subschemes subscheme NOUN ital-3222 64 14 ( ( PUNCT ital-3222 64 15 modifications modification NOUN ital-3222 64 16 of of ADP ital-3222 64 17 the the DET ital-3222 64 18 original original ADJ ital-3222 64 19 processing processing NOUN ital-3222 64 20 steps step NOUN ital-3222 64 21 or or CCONJ ital-3222 64 22 additional additional ADJ ital-3222 64 23 processing processing NOUN ital-3222 64 24 steps step NOUN ital-3222 64 25 ) ) PUNCT ital-3222 64 26 that that PRON ital-3222 64 27 can can AUX ital-3222 64 28 help help VERB ital-3222 64 29 compression compression NOUN ital-3222 64 30 — — PUNCT ital-3222 64 31 provided provide VERB ital-3222 64 32 the the DET ital-3222 64 33 issue issue NOUN ital-3222 64 34 that that SCONJ ital-3222 64 35 a a DET ital-3222 64 36 given give VERB ital-3222 64 37 subscheme subscheme NOUN ital-3222 64 38 addresses address NOUN ital-3222 64 39 is be AUX ital-3222 64 40 valid valid ADJ ital-3222 64 41 for for SCONJ ital-3222 64 42 the the DET ital-3222 64 43 document document NOUN ital-3222 64 44 format format NOUN ital-3222 64 45 being be AUX ital-3222 64 46 compressed compress VERB ital-3222 64 47 . . 
PUNCT ital-3222 65 1 there there PRON ital-3222 65 2 are be VERB ital-3222 65 3 two two NUM ital-3222 65 4 groups group NOUN ital-3222 65 5 of of ADP ital-3222 65 6 subschemes subscheme NOUN ital-3222 65 7 : : PUNCT ital-3222 65 8 the the DET ital-3222 65 9 first first ADJ ital-3222 65 10 consists consist VERB ital-3222 65 11 of of ADP ital-3222 65 12 solutions solution NOUN ital-3222 65 13 that that PRON ital-3222 65 14 can can AUX ital-3222 65 15 be be AUX ital-3222 65 16 applied apply VERB ital-3222 65 17 to to ADP ital-3222 65 18 more more ADJ ital-3222 65 19 than than ADP ital-3222 65 20 one one NUM ital-3222 65 21 document document NOUN ital-3222 65 22 format format NOUN ital-3222 65 23 . . PUNCT ital-3222 66 1 it it PRON ital-3222 66 2 includes include VERB ital-3222 66 3 n n ADP ital-3222 66 4 changing change VERB ital-3222 66 5 the the DET ital-3222 66 6 minimum minimum ADJ ital-3222 66 7 word word NOUN ital-3222 66 8 frequency frequency NOUN ital-3222 66 9 threshold threshold NOUN ital-3222 66 10 ( ( PUNCT ital-3222 66 11 the the DET ital-3222 66 12 “ " PUNCT ital-3222 66 13 minfr minfr ADJ ital-3222 66 14 ” " PUNCT ital-3222 66 15 column column NOUN ital-3222 66 16 in in ADP ital-3222 66 17 table table NOUN ital-3222 66 18 1 1 NUM ital-3222 66 19 ) ) PUNCT ital-3222 66 20 that that SCONJ ital-3222 66 21 a a DET ital-3222 66 22 word word NOUN ital-3222 66 23 must must AUX ital-3222 66 24 pass pass VERB ital-3222 66 25 to to PART ital-3222 66 26 be be AUX ital-3222 66 27 included include VERB ital-3222 66 28 in in ADP ital-3222 66 29 the the DET ital-3222 66 30 semidynamic semidynamic ADJ ital-3222 66 31 dictionary dictionary NOUN ital-3222 66 32 ( ( PUNCT ital-3222 66 33 notice notice VERB ital-3222 66 34 that that SCONJ ital-3222 66 35 no no DET ital-3222 66 36 word word NOUN ital-3222 66 37 can can AUX ital-3222 66 38 be be AUX ital-3222 66 39 added add VERB ital-3222 66 40 to to ADP ital-3222 66 41 a a DET ital-3222 66 42 static static ADJ ital-3222 66 
43 dictionary dictionary NOUN ital-3222 66 44 ) ) PUNCT ital-3222 66 45 ; ; PUNCT ital-3222 66 46 n n X ital-3222 66 47 using use VERB ital-3222 66 48 spaceless spaceless NOUN ital-3222 66 49 word word NOUN ital-3222 66 50 model model NOUN ital-3222 66 51 ( ( PUNCT ital-3222 66 52 “ " PUNCT ital-3222 66 53 wdspc wdspc NOUN ital-3222 66 54 ” " PUNCT ital-3222 66 55 column column NOUN ital-3222 66 56 in in ADP ital-3222 66 57 table table NOUN ital-3222 66 58 1 1 NUM ital-3222 66 59 ) ) PUNCT ital-3222 66 60 in in ADP ital-3222 66 61 which which PRON ital-3222 66 62 a a DET ital-3222 66 63 single single ADJ ital-3222 66 64 space space NOUN ital-3222 66 65 between between ADP ital-3222 66 66 two two NUM ital-3222 66 67 words word NOUN ital-3222 66 68 is be AUX ital-3222 66 69 not not PART ital-3222 66 70 encoded encode VERB ital-3222 66 71 at at ADV ital-3222 66 72 all all ADV ital-3222 66 73 ; ; PUNCT ital-3222 66 74 instead instead ADV ital-3222 66 75 , , PUNCT ital-3222 66 76 a a DET ital-3222 66 77 flag flag NOUN ital-3222 66 78 is be AUX ital-3222 66 79 used use VERB ital-3222 66 80 to to PART ital-3222 66 81 mark mark VERB ital-3222 66 82 two two NUM ital-3222 66 83 neighboring neighboring NOUN ital-3222 66 84 words word NOUN ital-3222 66 85 that that PRON ital-3222 66 86 are be AUX ital-3222 66 87 not not PART ital-3222 66 88 separated separate VERB ital-3222 66 89 by by ADP ital-3222 66 90 a a DET ital-3222 66 91 space space NOUN ital-3222 66 92 ; ; PUNCT ital-3222 66 93 n n CCONJ ital-3222 66 94 run run VERB ital-3222 66 95 - - PUNCT ital-3222 66 96 length length NOUN ital-3222 66 97 encoding encoding NOUN ital-3222 66 98 of of ADP ital-3222 66 99 multiple multiple ADJ ital-3222 66 100 spaces space NOUN ital-3222 66 101 ( ( PUNCT ital-3222 66 102 “ " PUNCT ital-3222 66 103 spruns sprun NOUN ital-3222 66 104 ” " PUNCT ital-3222 66 105 column column NOUN ital-3222 66 106 in in ADP ital-3222 66 107 table table NOUN ital-3222 66 108 1 1 NUM ital-3222 66 109 ) ) 
PUNCT ital-3222 66 110 ; ; PUNCT ital-3222 66 111 n n CCONJ ital-3222 66 112 letter letter NOUN ital-3222 66 113 containers container NOUN ital-3222 66 114 ( ( PUNCT ital-3222 66 115 “ " PUNCT ital-3222 66 116 letcnt letcnt ADJ ital-3222 66 117 ” " PUNCT ital-3222 66 118 column column NOUN ital-3222 66 119 in in ADP ital-3222 66 120 table table NOUN ital-3222 66 121 1 1 NUM ital-3222 66 122 ) ) PUNCT ital-3222 66 123 , , PUNCT ital-3222 66 124 that that ADV ital-3222 66 125 is is ADV ital-3222 66 126 , , PUNCT ital-3222 66 127 removing remove VERB ital-3222 66 128 sequences sequence NOUN ital-3222 66 129 of of ADP ital-3222 66 130 letters letter NOUN ital-3222 66 131 ( ( PUNCT ital-3222 66 132 belonging belong VERB ital-3222 66 133 to to ADP ital-3222 66 134 words word NOUN ital-3222 66 135 that that PRON ital-3222 66 136 are be AUX ital-3222 66 137 not not PART ital-3222 66 138 included include VERB ital-3222 66 139 in in ADP ital-3222 66 140 the the DET ital-3222 66 141 dictionary dictionary PROPN ital-3222 66 142 ) ) PUNCT ital-3222 66 143 to to ADP ital-3222 66 144 a a DET ital-3222 66 145 separate separate ADJ ital-3222 66 146 location location NOUN ital-3222 66 147 in in ADP ital-3222 66 148 the the DET ital-3222 66 149 output output NOUN ital-3222 66 150 file file NOUN ital-3222 66 151 ( ( PUNCT ital-3222 66 152 and and CCONJ ital-3222 66 153 leaving leave VERB ital-3222 66 154 a a DET ital-3222 66 155 flag flag NOUN ital-3222 66 156 at at ADP ital-3222 66 157 their their PRON ital-3222 66 158 original original ADJ ital-3222 66 159 position position NOUN ital-3222 66 160 ) ) PUNCT ital-3222 66 161 . . 
PUNCT ital-3222 67 1 table table NOUN ital-3222 67 2 1 1 NUM ital-3222 67 3 shows show VERB ital-3222 67 4 the the DET ital-3222 67 5 assignment assignment NOUN ital-3222 67 6 of of ADP ital-3222 67 7 the the DET ital-3222 67 8 mentioned mention VERB ital-3222 67 9 subschemes subscheme NOUN ital-3222 67 10 to to PART ital-3222 67 11 document document NOUN ital-3222 67 12 formats format NOUN ital-3222 67 13 , , PUNCT ital-3222 67 14 with with ADP ital-3222 67 15 “ " PUNCT ital-3222 67 16 + + NOUN ital-3222 67 17 ” " PUNCT ital-3222 67 18 denoting denote VERB ital-3222 67 19 that that SCONJ ital-3222 67 20 a a DET ital-3222 67 21 given give VERB ital-3222 67 22 subscheme subscheme NOUN ital-3222 67 23 should should AUX ital-3222 67 24 be be AUX ital-3222 67 25 applied apply VERB ital-3222 67 26 when when SCONJ ital-3222 67 27 processing process VERB ital-3222 67 28 a a DET ital-3222 67 29 given give VERB ital-3222 67 30 document document NOUN ital-3222 67 31 format format NOUN ital-3222 67 32 . . PUNCT ital-3222 68 1 notice notice VERB ital-3222 68 2 that that SCONJ ital-3222 68 3 we we PRON ital-3222 68 4 use use VERB ital-3222 68 5 different different ADJ ital-3222 68 6 subschemes subscheme NOUN ital-3222 68 7 for for ADP ital-3222 68 8 the the DET ital-3222 68 9 same same ADJ ital-3222 68 10 format format NOUN ital-3222 68 11 depending depend VERB ital-3222 68 12 on on ADP ital-3222 68 13 whether whether SCONJ ital-3222 68 14 a a DET ital-3222 68 15 semidynamic semidynamic ADJ ital-3222 68 16 ( ( PUNCT ital-3222 68 17 ctdl ctdl NOUN ital-3222 68 18 ) ) PUNCT ital-3222 68 19 or or CCONJ ital-3222 68 20 static static ADJ ital-3222 68 21 ( ( PUNCT ital-3222 68 22 ctdl+ ctdl+ PROPN ital-3222 68 23 ) ) PUNCT ital-3222 68 24 dictionary dictionary PROPN ital-3222 68 25 is be AUX ital-3222 68 26 used use VERB ital-3222 68 27 . . 
PUNCT ital-3222 69 1 the the DET ital-3222 69 2 remaining remain VERB ital-3222 69 3 subschemes subscheme NOUN ital-3222 69 4 are be AUX ital-3222 69 5 applied apply VERB ital-3222 69 6 for for ADP ital-3222 69 7 only only ADV ital-3222 69 8 one one NUM ital-3222 69 9 document document NOUN ital-3222 69 10 format format NOUN ital-3222 69 11 . . PUNCT ital-3222 70 1 they they PRON ital-3222 70 2 attain attain VERB ital-3222 70 3 an an DET ital-3222 70 4 improvement improvement NOUN ital-3222 70 5 in in ADP ital-3222 70 6 compression compression NOUN ital-3222 70 7 performance performance NOUN ital-3222 70 8 by by ADP ital-3222 70 9 changing change VERB ital-3222 70 10 the the DET ital-3222 70 11 definition definition NOUN ital-3222 70 12 of of ADP ital-3222 70 13 acceptable acceptable ADJ ital-3222 70 14 dictionary dictionary ADJ ital-3222 70 15 words word NOUN ital-3222 70 16 , , PUNCT ital-3222 70 17 and and CCONJ ital-3222 70 18 , , PUNCT ital-3222 70 19 in in ADP ital-3222 70 20 one one NUM ital-3222 70 21 case case NOUN ital-3222 70 22 ( ( PUNCT ital-3222 70 23 ps ps PROPN ital-3222 70 24 ) ) PUNCT ital-3222 70 25 , , PUNCT ital-3222 70 26 by by ADP ital-3222 70 27 changing change VERB ital-3222 70 28 the the DET ital-3222 70 29 definition definition NOUN ital-3222 70 30 of of ADP ital-3222 70 31 number number NOUN ital-3222 70 32 strings string NOUN ital-3222 70 33 . . 
PUNCT ital-3222 71 1 the the DET ital-3222 71 2 encoder encoder NOUN ital-3222 71 3 for for ADP ital-3222 71 4 the the DET ital-3222 71 5 simplest simple ADJ ital-3222 71 6 of of ADP ital-3222 71 7 the the DET ital-3222 71 8 examined examine VERB ital-3222 71 9 formats format NOUN ital-3222 71 10 — — PUNCT ital-3222 71 11 plain plain ADJ ital-3222 71 12 text text NOUN ital-3222 71 13 files file NOUN ital-3222 71 14 — — PUNCT ital-3222 71 15 performs perform VERB ital-3222 71 16 no no DET ital-3222 71 17 additional additional ADJ ital-3222 71 18 formatspecific formatspecific ADJ ital-3222 71 19 processing processing NOUN ital-3222 71 20 . . PUNCT ital-3222 72 1 the the DET ital-3222 72 2 first first ADJ ital-3222 72 3 such such ADJ ital-3222 72 4 modification modification NOUN ital-3222 72 5 is be AUX ital-3222 72 6 in in ADP ital-3222 72 7 the the DET ital-3222 72 8 tex tex PROPN ital-3222 72 9 encoder encoder PROPN ital-3222 72 10 . . PUNCT ital-3222 73 1 the the DET ital-3222 73 2 difference difference NOUN ital-3222 73 3 is be AUX ital-3222 73 4 that that SCONJ ital-3222 73 5 words word NOUN ital-3222 73 6 beginning begin VERB ital-3222 73 7 with with ADP ital-3222 73 8 “ " PUNCT ital-3222 73 9 \ \ PROPN ital-3222 73 10 ” " PUNCT ital-3222 73 11 ( ( PUNCT ital-3222 73 12 tex tex PROPN ital-3222 73 13 146 146 NUM ital-3222 73 14 information information NOUN ital-3222 73 15 technology technology NOUN ital-3222 73 16 and and CCONJ ital-3222 73 17 libraries library NOUN ital-3222 73 18 | | NOUN ital-3222 73 19 september september PROPN ital-3222 73 20 2009 2009 NUM ital-3222 73 21 instructions instruction NOUN ital-3222 73 22 ) ) PUNCT ital-3222 73 23 are be AUX ital-3222 73 24 now now ADV ital-3222 73 25 accepted accept VERB ital-3222 73 26 in in ADP ital-3222 73 27 the the DET ital-3222 73 28 dictionary dictionary NOUN ital-3222 73 29 . . 
PUNCT ital-3222 74 1 the the DET ital-3222 74 2 modification modification NOUN ital-3222 74 3 for for ADP ital-3222 74 4 pdf pdf NOUN ital-3222 74 5 documents document NOUN ital-3222 74 6 is be AUX ital-3222 74 7 similar similar ADJ ital-3222 74 8 . . PUNCT ital-3222 75 1 in in ADP ital-3222 75 2 this this DET ital-3222 75 3 case case NOUN ital-3222 75 4 , , PUNCT ital-3222 75 5 bracketed bracket VERB ital-3222 75 6 words word NOUN ital-3222 75 7 ( ( PUNCT ital-3222 75 8 pdf pdf NOUN ital-3222 75 9 entities entity NOUN ital-3222 75 10 ) ) PUNCT ital-3222 75 11 — — PUNCT ital-3222 75 12 for for ADP ital-3222 75 13 example example NOUN ital-3222 75 14 “ " PUNCT ital-3222 75 15 ( ( PUNCT ital-3222 75 16 abc)”—are abc)”—are NOUN ital-3222 75 17 acceptable acceptable ADJ ital-3222 75 18 as as ADP ital-3222 75 19 dictionary dictionary ADJ ital-3222 75 20 entries entry NOUN ital-3222 75 21 . . PUNCT ital-3222 76 1 notice notice VERB ital-3222 76 2 that that SCONJ ital-3222 76 3 pdf pdf NOUN ital-3222 76 4 files file NOUN ital-3222 76 5 are be AUX ital-3222 76 6 internally internally ADV ital-3222 76 7 compressed compress VERB ital-3222 76 8 by by ADP ital-3222 76 9 default default NOUN ital-3222 76 10 — — PUNCT ital-3222 76 11 the the DET ital-3222 76 12 transform transform NOUN ital-3222 76 13 can can AUX ital-3222 76 14 be be AUX ital-3222 76 15 applied apply VERB ital-3222 76 16 after after ADP ital-3222 76 17 decompressing decompress VERB ital-3222 76 18 them they PRON ital-3222 76 19 into into ADP ital-3222 76 20 textual textual ADJ ital-3222 76 21 format format NOUN ital-3222 76 22 . . PUNCT ital-3222 77 1 the the DET ital-3222 77 2 precomp precomp NOUN ital-3222 77 3 tool tool NOUN ital-3222 77 4 is be AUX ital-3222 77 5 used use VERB ital-3222 77 6 for for ADP ital-3222 77 7 this this DET ital-3222 77 8 purpose purpose NOUN ital-3222 77 9 . . 
PUNCT ital-3222 78 1 the the DET ital-3222 78 2 subscheme subscheme NOUN ital-3222 78 3 for for ADP ital-3222 78 4 ps ps NOUN ital-3222 78 5 files file NOUN ital-3222 78 6 features feature VERB ital-3222 78 7 two two NUM ital-3222 78 8 modifications modification NOUN ital-3222 78 9 : : PUNCT ital-3222 78 10 its its PRON ital-3222 78 11 dictionary dictionary ADJ ital-3222 78 12 accepts accept VERB ital-3222 78 13 words word NOUN ital-3222 78 14 beginning begin VERB ital-3222 78 15 with with ADP ital-3222 78 16 “ " PUNCT ital-3222 78 17 / / SYM ital-3222 78 18 ” " PUNCT ital-3222 78 19 and and CCONJ ital-3222 78 20 “ " PUNCT ital-3222 78 21 \ \ PROPN ital-3222 78 22 ” " PUNCT ital-3222 78 23 or or CCONJ ital-3222 78 24 ending end VERB ital-3222 78 25 with with ADP ital-3222 78 26 “ " PUNCT ital-3222 78 27 ( ( PUNCT ital-3222 78 28 “ " PUNCT ital-3222 78 29 , , PUNCT ital-3222 78 30 and and CCONJ ital-3222 78 31 its its PRON ital-3222 78 32 number number NOUN ital-3222 78 33 tokens token NOUN ital-3222 78 34 can can AUX ital-3222 78 35 contain contain VERB ital-3222 78 36 not not PART ital-3222 78 37 only only ADV ital-3222 78 38 decimal decimal ADJ ital-3222 78 39 but but CCONJ ital-3222 78 40 also also ADV ital-3222 78 41 hexadecimal hexadecimal ADJ ital-3222 78 42 digits digit NOUN ital-3222 78 43 ( ( PUNCT ital-3222 78 44 though though SCONJ ital-3222 78 45 a a DET ital-3222 78 46 single single ADJ ital-3222 78 47 number number NOUN ital-3222 78 48 must must AUX ital-3222 78 49 have have VERB ital-3222 78 50 at at ADV ital-3222 78 51 least least ADV ital-3222 78 52 one one NUM ital-3222 78 53 decimal decimal ADJ ital-3222 78 54 digit digit NOUN ital-3222 78 55 ) ) PUNCT ital-3222 78 56 . . 
PUNCT ital-3222 79 1 the the DET ital-3222 79 2 hexadecimal hexadecimal ADJ ital-3222 79 3 number number NOUN ital-3222 79 4 must must AUX ital-3222 79 5 be be AUX ital-3222 79 6 at at ADP ital-3222 79 7 least least ADJ ital-3222 79 8 6 6 NUM ital-3222 79 9 digits digit NOUN ital-3222 79 10 long long ADV ital-3222 79 11 , , PUNCT ital-3222 79 12 and and CCONJ ital-3222 79 13 is be AUX ital-3222 79 14 encoded encode VERB ital-3222 79 15 with with ADP ital-3222 79 16 a a DET ital-3222 79 17 flag flag NOUN ital-3222 79 18 : : PUNCT ital-3222 79 19 a a DET ital-3222 79 20 byte byte NOUN ital-3222 79 21 containing contain VERB ital-3222 79 22 its its PRON ital-3222 79 23 length length NOUN ital-3222 79 24 ( ( PUNCT ital-3222 79 25 numbers number NOUN ital-3222 79 26 with with ADP ital-3222 79 27 more more ADJ ital-3222 79 28 than than ADP ital-3222 79 29 261 261 NUM ital-3222 79 30 digits digit NOUN ital-3222 79 31 are be AUX ital-3222 79 32 split split VERB ital-3222 79 33 into into ADP ital-3222 79 34 parts part NOUN ital-3222 79 35 ) ) PUNCT ital-3222 79 36 and and CCONJ ital-3222 79 37 a a DET ital-3222 79 38 sequence sequence NOUN ital-3222 79 39 of of ADP ital-3222 79 40 bytes byte NOUN ital-3222 79 41 , , PUNCT ital-3222 79 42 each each PRON ital-3222 79 43 containing contain VERB ital-3222 79 44 two two NUM ital-3222 79 45 digits digit NOUN ital-3222 79 46 from from ADP ital-3222 79 47 the the DET ital-3222 79 48 number number NOUN ital-3222 79 49 ( ( PUNCT ital-3222 79 50 if if SCONJ ital-3222 79 51 the the DET ital-3222 79 52 number number NOUN ital-3222 79 53 of of ADP ital-3222 79 54 digits digit NOUN ital-3222 79 55 is be AUX ital-3222 79 56 odd odd ADJ ital-3222 79 57 , , PUNCT ital-3222 79 58 the the DET ital-3222 79 59 last last ADJ ital-3222 79 60 byte byte NOUN ital-3222 79 61 contains contain VERB ital-3222 79 62 only only ADV ital-3222 79 63 one one NUM ital-3222 79 64 digit digit NOUN ital-3222 79 65 ) ) PUNCT ital-3222 79 66 . . 
PUNCT ital-3222 80 1 for for ADP ital-3222 80 2 rtf rtf ADJ ital-3222 80 3 documents document NOUN ital-3222 80 4 , , PUNCT ital-3222 80 5 the the DET ital-3222 80 6 dictionary dictionary NOUN ital-3222 80 7 accepts accept VERB ital-3222 80 8 the the DET ital-3222 80 9 “ " PUNCT ital-3222 80 10 \”-preceded \”-preceded ADJ ital-3222 80 11 words word NOUN ital-3222 80 12 , , PUNCT ital-3222 80 13 like like ADP ital-3222 80 14 the the DET ital-3222 80 15 tex tex PROPN ital-3222 80 16 files file NOUN ital-3222 80 17 . . PUNCT ital-3222 81 1 moreover moreover ADV ital-3222 81 2 , , PUNCT ital-3222 81 3 the the DET ital-3222 81 4 hexadecimal hexadecimal ADJ ital-3222 81 5 numbers number NOUN ital-3222 81 6 are be AUX ital-3222 81 7 encoded encode VERB ital-3222 81 8 in in ADP ital-3222 81 9 the the DET ital-3222 81 10 same same ADJ ital-3222 81 11 way way NOUN ital-3222 81 12 as as ADP ital-3222 81 13 in in ADP ital-3222 81 14 the the DET ital-3222 81 15 ps ps PROPN ital-3222 81 16 subscheme subscheme NOUN ital-3222 81 17 so so SCONJ ital-3222 81 18 that that SCONJ ital-3222 81 19 rtf rtf VERB ital-3222 81 20 documents document NOUN ital-3222 81 21 containing contain VERB ital-3222 81 22 images image NOUN ital-3222 81 23 can can AUX ital-3222 81 24 be be AUX ital-3222 81 25 significantly significantly ADV ital-3222 81 26 reduced reduce VERB ital-3222 81 27 in in ADP ital-3222 81 28 size size NOUN ital-3222 81 29 . . 
PUNCT ital-3222 82 1 specialization specialization NOUN ital-3222 82 2 for for ADP ital-3222 82 3 xml xml NOUN ital-3222 82 4 is be AUX ital-3222 82 5 roughly roughly ADV ital-3222 82 6 the the DET ital-3222 82 7 transform transform NOUN ital-3222 82 8 described describe VERB ital-3222 82 9 in in ADP ital-3222 82 10 our our PRON ital-3222 82 11 earlier early ADJ ital-3222 82 12 article article NOUN ital-3222 82 13 , , PUNCT ital-3222 82 14 “ " PUNCT ital-3222 82 15 revisiting revisit VERB ital-3222 82 16 dictionarybased dictionarybase VERB ital-3222 82 17 compression compression NOUN ital-3222 82 18 . . PUNCT ital-3222 83 1 ”20 ”20 NOUN ital-3222 83 2 it it PRON ital-3222 83 3 allows allow VERB ital-3222 83 4 for for ADP ital-3222 83 5 xml xml NOUN ital-3222 83 6 start start VERB ital-3222 83 7 tags tag NOUN ital-3222 83 8 and and CCONJ ital-3222 83 9 entities entity NOUN ital-3222 83 10 to to PART ital-3222 83 11 be be AUX ital-3222 83 12 added add VERB ital-3222 83 13 to to ADP ital-3222 83 14 dictionary dictionary PROPN ital-3222 83 15 , , PUNCT ital-3222 83 16 and and CCONJ ital-3222 83 17 it it PRON ital-3222 83 18 replaces replace VERB ital-3222 83 19 every every DET ital-3222 83 20 end end NOUN ital-3222 83 21 tag tag NOUN ital-3222 83 22 respecting respect VERB ital-3222 83 23 the the DET ital-3222 83 24 xml xml NOUN ital-3222 83 25 well well ADV ital-3222 83 26 - - PUNCT ital-3222 83 27 formedness formedness ADJ ital-3222 83 28 rule rule NOUN ital-3222 83 29 ( ( PUNCT ital-3222 83 30 i.e. i.e. X ital-3222 83 31 , , PUNCT ital-3222 83 32 closing close VERB ital-3222 83 33 the the DET ital-3222 83 34 element element NOUN ital-3222 83 35 opened open VERB ital-3222 83 36 most most ADV ital-3222 83 37 recently recently ADV ital-3222 83 38 ) ) PUNCT ital-3222 83 39 with with ADP ital-3222 83 40 a a DET ital-3222 83 41 single single ADJ ital-3222 83 42 flag flag NOUN ital-3222 83 43 . . 
PUNCT ital-3222 84 1 it it PRON ital-3222 84 2 also also ADV ital-3222 84 3 uses use VERB ital-3222 84 4 a a DET ital-3222 84 5 single single ADJ ital-3222 84 6 flag flag NOUN ital-3222 84 7 to to PART ital-3222 84 8 denote denote VERB ital-3222 84 9 xml xml NOUN ital-3222 84 10 attribute attribute NOUN ital-3222 84 11 value value NOUN ital-3222 84 12 begin begin VERB ital-3222 84 13 and and CCONJ ital-3222 84 14 end end VERB ital-3222 84 15 marks mark NOUN ital-3222 84 16 . . PUNCT ital-3222 85 1 html html PROPN ital-3222 85 2 documents document NOUN ital-3222 85 3 are be AUX ital-3222 85 4 handled handle VERB ital-3222 85 5 similarly similarly ADV ital-3222 85 6 . . PUNCT ital-3222 86 1 the the DET ital-3222 86 2 only only ADJ ital-3222 86 3 difference difference NOUN ital-3222 86 4 is be AUX ital-3222 86 5 that that SCONJ ital-3222 86 6 the the DET ital-3222 86 7 tags tag NOUN ital-3222 86 8 that that PRON ital-3222 86 9 , , PUNCT ital-3222 86 10 according accord VERB ital-3222 86 11 to to ADP ital-3222 86 12 the the DET ital-3222 86 13 html html PROPN ital-3222 86 14 4.01 4.01 NUM ital-3222 86 15 specification specification NOUN ital-3222 86 16 , , PUNCT ital-3222 86 17 are be AUX ital-3222 86 18 not not PART ital-3222 86 19 expected expect VERB ital-3222 86 20 to to PART ital-3222 86 21 be be AUX ital-3222 86 22 followed follow VERB ital-3222 86 23 by by ADP ital-3222 86 24 an an DET ital-3222 86 25 endtag endtag NOUN ital-3222 86 26 ( ( PUNCT ital-3222 86 27 base base NOUN ital-3222 86 28 , , PUNCT ital-3222 86 29 link link NOUN ital-3222 86 30 , , PUNCT ital-3222 86 31 xbasehref xbasehref PROPN ital-3222 86 32 , , PUNCT ital-3222 86 33 br br PROPN ital-3222 86 34 , , PUNCT ital-3222 86 35 meta meta PROPN ital-3222 86 36 , , PUNCT ital-3222 86 37 hr hr NOUN ital-3222 86 38 , , PUNCT ital-3222 86 39 img img ADJ ital-3222 86 40 , , PUNCT ital-3222 86 41 area area NOUN ital-3222 86 42 , , PUNCT ital-3222 86 43 input input NOUN ital-3222 86 44 , , PUNCT ital-3222 
86 45 embed embed NOUN ital-3222 86 46 , , PUNCT ital-3222 86 47 param param NOUN ital-3222 86 48 and and CCONJ ital-3222 86 49 col col PROPN ital-3222 86 50 ) ) PUNCT ital-3222 86 51 are be AUX ital-3222 86 52 ignored ignore VERB ital-3222 86 53 by by ADP ital-3222 86 54 the the DET ital-3222 86 55 mechanism mechanism NOUN ital-3222 86 56 replacing replace VERB ital-3222 86 57 closing closing NOUN ital-3222 86 58 tags tag NOUN ital-3222 86 59 ( ( PUNCT ital-3222 86 60 so so SCONJ ital-3222 86 61 that that SCONJ ital-3222 86 62 it it PRON ital-3222 86 63 can can AUX ital-3222 86 64 guess guess VERB ital-3222 86 65 the the DET ital-3222 86 66 correct correct ADJ ital-3222 86 67 closing closing NOUN ital-3222 86 68 tag tag NOUN ital-3222 86 69 even even ADV ital-3222 86 70 after after SCONJ ital-3222 86 71 the the DET ital-3222 86 72 singular singular ADJ ital-3222 86 73 tags tag NOUN ital-3222 86 74 were be AUX ital-3222 86 75 encountered).21 encountered).21 NOUN ital-3222 86 76 n n ADP ital-3222 86 77 using use VERB ital-3222 86 78 the the DET ital-3222 86 79 scheme scheme NOUN ital-3222 86 80 in in ADP ital-3222 86 81 a a DET ital-3222 86 82 digital digital ADJ ital-3222 86 83 library library NOUN ital-3222 86 84 project project NOUN ital-3222 86 85 many many ADJ ital-3222 86 86 textual textual ADJ ital-3222 86 87 digital digital ADJ ital-3222 86 88 libraries library NOUN ital-3222 86 89 seriously seriously ADV ital-3222 86 90 lack lack VERB ital-3222 86 91 text text NOUN ital-3222 86 92 compression compression NOUN ital-3222 86 93 capabilities capability NOUN ital-3222 86 94 , , PUNCT ital-3222 86 95 and and CCONJ ital-3222 86 96 popular popular ADJ ital-3222 86 97 digital digital ADJ ital-3222 86 98 library library NOUN ital-3222 86 99 systems system NOUN ital-3222 86 100 , , PUNCT ital-3222 86 101 such such ADJ ital-3222 86 102 as as ADP ital-3222 86 103 greenstone greenstone NOUN ital-3222 86 104 , , PUNCT ital-3222 86 105 have have VERB ital-3222 86 106 no no 
DET ital-3222 86 107 embedded embed VERB ital-3222 86 108 efficient efficient ADJ ital-3222 86 109 text text NOUN ital-3222 86 110 compression.22 compression.22 NOUN ital-3222 86 111 therefore therefore ADV ital-3222 86 112 we we PRON ital-3222 86 113 have have AUX ital-3222 86 114 decided decide VERB ital-3222 86 115 to to PART ital-3222 86 116 develop develop VERB ital-3222 86 117 ctdl ctdl NOUN ital-3222 86 118 as as ADP ital-3222 86 119 an an DET ital-3222 86 120 open open ADJ ital-3222 86 121 - - PUNCT ital-3222 86 122 source source NOUN ital-3222 86 123 software software NOUN ital-3222 86 124 library library NOUN ital-3222 86 125 . . PUNCT ital-3222 87 1 the the DET ital-3222 87 2 library library NOUN ital-3222 87 3 is be AUX ital-3222 87 4 free free ADJ ital-3222 87 5 to to PART ital-3222 87 6 use use VERB ital-3222 87 7 and and CCONJ ital-3222 87 8 can can AUX ital-3222 87 9 be be AUX ital-3222 87 10 downloaded download VERB ital-3222 87 11 from from ADP ital-3222 87 12 www.ii.uni.wroc www.ii.uni.wroc PROPN ital-3222 87 13 .pl/~inikep .pl/~inikep SYM ital-3222 87 14 / / SYM ital-3222 87 15 research research NOUN ital-3222 87 16 / / SYM ital-3222 87 17 ctdl ctdl PROPN ital-3222 87 18 / / SYM ital-3222 87 19 ctdl09.zip ctdl09.zip VERB ital-3222 87 20 . . PUNCT ital-3222 88 1 the the DET ital-3222 88 2 library library NOUN ital-3222 88 3 does do AUX ital-3222 88 4 not not PART ital-3222 88 5 require require VERB ital-3222 88 6 any any DET ital-3222 88 7 additional additional ADJ ital-3222 88 8 nonstandard nonstandard ADJ ital-3222 88 9 libraries library NOUN ital-3222 88 10 . . 
PUNCT ital-3222 89 1 it it PRON ital-3222 89 2 has have VERB ital-3222 89 3 both both CCONJ ital-3222 89 4 the the DET ital-3222 89 5 text text NOUN ital-3222 89 6 transform transform NOUN ital-3222 89 7 and and CCONJ ital-3222 89 8 back back ADJ ital-3222 89 9 - - PUNCT ital-3222 89 10 end end NOUN ital-3222 89 11 compressors compressor NOUN ital-3222 89 12 embedded embed VERB ital-3222 89 13 . . PUNCT ital-3222 90 1 however however ADV ital-3222 90 2 , , PUNCT ital-3222 90 3 compressing compress VERB ital-3222 90 4 pdf pdf NOUN ital-3222 90 5 documents document NOUN ital-3222 90 6 requires require VERB ital-3222 90 7 them they PRON ital-3222 90 8 to to PART ital-3222 90 9 be be AUX ital-3222 90 10 decompressed decompress VERB ital-3222 90 11 first first ADV ital-3222 90 12 with with ADP ital-3222 90 13 the the DET ital-3222 90 14 free free ADJ ital-3222 90 15 precomp precomp NOUN ital-3222 90 16 tool tool NOUN ital-3222 90 17 . . PUNCT ital-3222 91 1 the the DET ital-3222 91 2 compression compression NOUN ital-3222 91 3 routines routine NOUN ital-3222 91 4 are be AUX ital-3222 91 5 wrapped wrap VERB ital-3222 91 6 in in ADP ital-3222 91 7 a a DET ital-3222 91 8 code code NOUN ital-3222 91 9 selecting select VERB ital-3222 91 10 the the DET ital-3222 91 11 best good ADJ ital-3222 91 12 algorithm algorithm NOUN ital-3222 91 13 depending depend VERB ital-3222 91 14 on on ADP ital-3222 91 15 the the DET ital-3222 91 16 chosen choose VERB ital-3222 91 17 compression compression NOUN ital-3222 91 18 mode mode NOUN ital-3222 91 19 and and CCONJ ital-3222 91 20 the the DET ital-3222 91 21 input input NOUN ital-3222 91 22 document document NOUN ital-3222 91 23 format format NOUN ital-3222 91 24 . . 
PUNCT ital-3222 92 1 the the DET ital-3222 92 2 interface interface NOUN ital-3222 92 3 of of ADP ital-3222 92 4 the the DET ital-3222 92 5 library library NOUN ital-3222 92 6 consists consist VERB ital-3222 92 7 of of ADP ital-3222 92 8 only only ADV ital-3222 92 9 two two NUM ital-3222 92 10 functions function NOUN ital-3222 92 11 : : PUNCT ital-3222 92 12 ctdl_encode ctdl_encode NOUN ital-3222 92 13 and and CCONJ ital-3222 92 14 ctdl_decode ctdl_decode INTJ ital-3222 92 15 , , PUNCT ital-3222 92 16 for for ADP ital-3222 92 17 , , PUNCT ital-3222 92 18 respectively respectively ADV ital-3222 92 19 , , PUNCT ital-3222 92 20 compressing compress VERB ital-3222 92 21 and and CCONJ ital-3222 92 22 decompressing decompress VERB ital-3222 92 23 documents document NOUN ital-3222 92 24 . . PUNCT ital-3222 93 1 ctdl_encode ctdl_encode NOUN ital-3222 93 2 takes take VERB ital-3222 93 3 the the DET ital-3222 93 4 following follow VERB ital-3222 93 5 parameters parameter NOUN ital-3222 93 6 : : PUNCT ital-3222 93 7 n n PROPN ital-3222 93 8 char char PROPN ital-3222 93 9 * * PUNCT ital-3222 93 10 filename filename NOUN ital-3222 93 11 — — PUNCT ital-3222 93 12 name name NOUN ital-3222 93 13 of of ADP ital-3222 93 14 the the DET ital-3222 93 15 input input NOUN ital-3222 93 16 ( ( PUNCT ital-3222 93 17 uncompressed uncompressed ADJ ital-3222 93 18 ) ) PUNCT ital-3222 93 19 document document NOUN ital-3222 93 20 n n NOUN ital-3222 93 21 char char NOUN ital-3222 93 22 * * PUNCT ital-3222 93 23 filename_out filename_out ADP ital-3222 93 24 — — PUNCT ital-3222 93 25 name name NOUN ital-3222 93 26 of of ADP ital-3222 93 27 the the DET ital-3222 93 28 output output NOUN ital-3222 93 29 ( ( PUNCT ital-3222 93 30 compressed compress VERB ital-3222 93 31 ) ) PUNCT ital-3222 93 32 document document NOUN ital-3222 93 33 n n NOUN ital-3222 93 34 efiletype efiletype NOUN ital-3222 93 35 ftype ftype NOUN ital-3222 93 36 — — PUNCT ital-3222 93 37 format format NOUN ital-3222 93 38 of of ADP 
ital-3222 93 39 the the DET ital-3222 93 40 input input NOUN ital-3222 93 41 document document NOUN ital-3222 93 42 , , PUNCT ital-3222 93 43 defined define VERB ital-3222 93 44 as as ADP ital-3222 93 45 : : PUNCT ital-3222 93 46 enum enum ADJ ital-3222 93 47 efiletype efiletype NOUN ital-3222 93 48 { { PUNCT ital-3222 93 49 html html PROPN ital-3222 93 50 , , PUNCT ital-3222 93 51 pdf pdf PROPN ital-3222 93 52 , , PUNCT ital-3222 93 53 ps ps PROPN ital-3222 93 54 , , PUNCT ital-3222 93 55 rtf rtf PROPN ital-3222 93 56 , , PUNCT ital-3222 93 57 tex tex PROPN ital-3222 93 58 , , PUNCT ital-3222 93 59 txt txt PROPN ital-3222 93 60 , , PUNCT ital-3222 93 61 xml xml PROPN ital-3222 93 62 } } PUNCT ital-3222 93 63 ; ; PUNCT ital-3222 93 64 n n CCONJ ital-3222 93 65 edictionarytype edictionarytype NOUN ital-3222 93 66 dtype dtype NOUN ital-3222 93 67 — — PUNCT ital-3222 93 68 dictionary dictionary ADJ ital-3222 93 69 type type NOUN ital-3222 93 70 , , PUNCT ital-3222 93 71 defined define VERB ital-3222 93 72 as as ADP ital-3222 93 73 : : PUNCT ital-3222 93 74 enum enum ADJ ital-3222 93 75 edictionarytype edictionarytype NOUN ital-3222 93 76 { { PUNCT ital-3222 93 77 static static ADJ ital-3222 93 78 , , PUNCT ital-3222 93 79 semidynamic semidynamic ADJ ital-3222 93 80 } } PUNCT ital-3222 93 81 ; ; PUNCT ital-3222 93 82 ctdl_decode ctdl_decode NOUN ital-3222 93 83 takes take VERB ital-3222 93 84 the the DET ital-3222 93 85 following follow VERB ital-3222 93 86 parameters parameter NOUN ital-3222 93 87 : : PUNCT ital-3222 93 88 n n PROPN ital-3222 93 89 char char PROPN ital-3222 93 90 * * PUNCT ital-3222 93 91 filename filename NOUN ital-3222 93 92 — — PUNCT ital-3222 93 93 name name NOUN ital-3222 93 94 of of ADP ital-3222 93 95 the the DET ital-3222 93 96 input input NOUN ital-3222 93 97 ( ( PUNCT ital-3222 93 98 compressed compress VERB ital-3222 93 99 ) ) PUNCT ital-3222 93 100 document document NOUN ital-3222 93 101 n n NOUN ital-3222 93 102 char char NOUN ital-3222 
93 103 * * PUNCT ital-3222 93 104 filename_out filename_out ADP ital-3222 93 105 — — PUNCT ital-3222 93 106 name name NOUN ital-3222 93 107 of of ADP ital-3222 93 108 the the DET ital-3222 93 109 output output NOUN ital-3222 93 110 ( ( PUNCT ital-3222 93 111 decompressed decompress VERB ital-3222 93 112 ) ) PUNCT ital-3222 93 113 document document NOUN ital-3222 93 114 table table NOUN ital-3222 93 115 1 1 NUM ital-3222 93 116 . . NUM ital-3222 93 117 universal universal ADJ ital-3222 93 118 transform transform NOUN ital-3222 93 119 optimizations optimization NOUN ital-3222 93 120 ctdl ctdl PROPN ital-3222 93 121 settings setting NOUN ital-3222 93 122 ctdl+ ctdl+ X ital-3222 93 123 settings setting NOUN ital-3222 93 124 format format NOUN ital-3222 93 125 minfr minfr PROPN ital-3222 93 126 wdspc wdspc PROPN ital-3222 93 127 spruns spruns PROPN ital-3222 93 128 letcnt letcnt ADJ ital-3222 93 129 wdspc wdspc NOUN ital-3222 93 130 spruns sprun NOUN ital-3222 93 131 letcnt letcnt ADJ ital-3222 93 132 html html NOUN ital-3222 93 133 3 3 NUM ital-3222 93 134 + + NUM ital-3222 93 135 + + NUM ital-3222 93 136 + + CCONJ ital-3222 93 137 + + CCONJ ital-3222 93 138 + + CCONJ ital-3222 93 139 pdf pdf VERB ital-3222 93 140 3 3 NUM ital-3222 93 141 ps ps NOUN ital-3222 93 142 6 6 NUM ital-3222 93 143 + + NUM ital-3222 93 144 + + NOUN ital-3222 93 145 rtf rtf VERB ital-3222 93 146 3 3 NUM ital-3222 93 147 + + NUM ital-3222 93 148 + + NUM ital-3222 93 149 + + CCONJ ital-3222 93 150 tex tex NOUN ital-3222 93 151 3 3 NUM ital-3222 93 152 + + NUM ital-3222 93 153 + + NUM ital-3222 93 154 + + CCONJ ital-3222 93 155 + + CCONJ ital-3222 93 156 + + CCONJ ital-3222 93 157 + + NUM ital-3222 93 158 txt txt NOUN ital-3222 93 159 6 6 NUM ital-3222 93 160 + + NUM ital-3222 93 161 + + NUM ital-3222 93 162 + + CCONJ ital-3222 93 163 + + CCONJ ital-3222 93 164 + + CCONJ ital-3222 93 165 + + CCONJ ital-3222 93 166 xml xml NOUN ital-3222 93 167 3 3 NUM ital-3222 93 168 + + NUM ital-3222 93 169 + + 
NUM ital-3222 93 170 + + CCONJ ital-3222 93 171 + + CCONJ ital-3222 93 172 + + CCONJ ital-3222 93 173 the the DET ital-3222 93 174 efficient efficient ADJ ital-3222 93 175 storage storage NOUN ital-3222 93 176 of of ADP ital-3222 93 177 text text NOUN ital-3222 93 178 documents document NOUN ital-3222 93 179 in in ADP ital-3222 93 180 digital digital ADJ ital-3222 93 181 libraries library NOUN ital-3222 93 182 | | NOUN ital-3222 93 183 skibiński skibiński NOUN ital-3222 93 184 and and CCONJ ital-3222 93 185 swacha swacha PROPN ital-3222 93 186 147 147 NUM ital-3222 93 187 the the DET ital-3222 93 188 library library NOUN ital-3222 93 189 was be AUX ital-3222 93 190 written write VERB ital-3222 93 191 in in ADP ital-3222 93 192 the the DET ital-3222 93 193 c++ c++ PRON ital-3222 93 194 programming programming NOUN ital-3222 93 195 language language NOUN ital-3222 93 196 , , PUNCT ital-3222 93 197 but but CCONJ ital-3222 93 198 a a DET ital-3222 93 199 compiled compile VERB ital-3222 93 200 static static ADJ ital-3222 93 201 library library NOUN ital-3222 93 202 is be AUX ital-3222 93 203 also also ADV ital-3222 93 204 distributed distribute VERB ital-3222 93 205 ; ; PUNCT ital-3222 93 206 thus thus ADV ital-3222 93 207 it it PRON ital-3222 93 208 can can AUX ital-3222 93 209 be be AUX ital-3222 93 210 used use VERB ital-3222 93 211 in in ADP ital-3222 93 212 any any DET ital-3222 93 213 language language NOUN ital-3222 93 214 that that PRON ital-3222 93 215 can can AUX ital-3222 93 216 link link VERB ital-3222 93 217 such such ADJ ital-3222 93 218 libraries library NOUN ital-3222 93 219 . . 
PUNCT ital-3222 94 1 currently currently ADV ital-3222 94 2 , , PUNCT ital-3222 94 3 the the DET ital-3222 94 4 library library NOUN ital-3222 94 5 is be AUX ital-3222 94 6 compatible compatible ADJ ital-3222 94 7 with with ADP ital-3222 94 8 two two NUM ital-3222 94 9 platforms platform NOUN ital-3222 94 10 : : PUNCT ital-3222 94 11 microsoft microsoft PROPN ital-3222 94 12 windows window NOUN ital-3222 94 13 and and CCONJ ital-3222 94 14 linux linux PROPN ital-3222 94 15 . . PUNCT ital-3222 95 1 to to PART ital-3222 95 2 use use VERB ital-3222 95 3 static static ADJ ital-3222 95 4 dictionaries dictionary NOUN ital-3222 95 5 , , PUNCT ital-3222 95 6 the the DET ital-3222 95 7 respective respective ADJ ital-3222 95 8 dictionary dictionary ADJ ital-3222 95 9 file file NOUN ital-3222 95 10 must must AUX ital-3222 95 11 be be AUX ital-3222 95 12 available available ADJ ital-3222 95 13 . . PUNCT ital-3222 96 1 the the DET ital-3222 96 2 library library NOUN ital-3222 96 3 is be AUX ital-3222 96 4 supplied supply VERB ital-3222 96 5 with with ADP ital-3222 96 6 an an DET ital-3222 96 7 english english PROPN ital-3222 96 8 dictionary dictionary NOUN ital-3222 96 9 trained train VERB ital-3222 96 10 on on ADP ital-3222 96 11 a a DET ital-3222 96 12 3 3 NUM ital-3222 96 13 gb gb PROPN ital-3222 96 14 text text NOUN ital-3222 96 15 corpus corpus NOUN ital-3222 96 16 from from ADP ital-3222 96 17 project project PROPN ital-3222 96 18 gutenberg.23 gutenberg.23 VERB ital-3222 96 19 seven seven NUM ital-3222 96 20 other other ADJ ital-3222 96 21 dictionaries dictionary NOUN ital-3222 96 22 — — PUNCT ital-3222 96 23 german german ADJ ital-3222 96 24 , , PUNCT ital-3222 96 25 spanish spanish ADJ ital-3222 96 26 , , PUNCT ital-3222 96 27 finnish finnish ADJ ital-3222 96 28 , , PUNCT ital-3222 96 29 french french ADJ ital-3222 96 30 , , PUNCT ital-3222 96 31 italian italian ADJ ital-3222 96 32 , , PUNCT ital-3222 96 33 polish polish ADJ ital-3222 96 34 , , PUNCT ital-3222 96 35 and 
and CCONJ ital-3222 96 36 russian russian ADJ ital-3222 96 37 — — PUNCT ital-3222 96 38 can can AUX ital-3222 96 39 be be AUX ital-3222 96 40 freely freely ADV ital-3222 96 41 downloaded download VERB ital-3222 96 42 from from ADP ital-3222 96 43 www.ii.uni.wroc.pl/~inikep/ www.ii.uni.wroc.pl/~inikep/ NOUN ital-3222 96 44 research research NOUN ital-3222 96 45 / / SYM ital-3222 96 46 dicts dict NOUN ital-3222 96 47 . . PUNCT ital-3222 97 1 there there PRON ital-3222 97 2 also also ADV ital-3222 97 3 is be AUX ital-3222 97 4 a a DET ital-3222 97 5 tool tool NOUN ital-3222 97 6 that that PRON ital-3222 97 7 helps help VERB ital-3222 97 8 create create VERB ital-3222 97 9 a a DET ital-3222 97 10 new new ADJ ital-3222 97 11 dictionary dictionary NOUN ital-3222 97 12 from from ADP ital-3222 97 13 any any DET ital-3222 97 14 given give VERB ital-3222 97 15 corpus corpus NOUN ital-3222 97 16 of of ADP ital-3222 97 17 documents document NOUN ital-3222 97 18 , , PUNCT ital-3222 97 19 available available ADJ ital-3222 97 20 from from ADP ital-3222 97 21 skibiński skibiński ADJ ital-3222 97 22 upon upon SCONJ ital-3222 97 23 request request NOUN ital-3222 97 24 via via ADP ital-3222 97 25 e e NOUN ital-3222 97 26 - - NOUN ital-3222 97 27 mail mail NOUN ital-3222 97 28 ( ( PUNCT ital-3222 97 29 inikep@ii.uni inikep@ii.uni PROPN ital-3222 97 30 .wroc.pl .wroc.pl X ital-3222 97 31 ) ) PUNCT ital-3222 97 32 . . 
PUNCT ital-3222 98 1 the the DET ital-3222 98 2 library library NOUN ital-3222 98 3 can can AUX ital-3222 98 4 be be AUX ital-3222 98 5 used use VERB ital-3222 98 6 to to PART ital-3222 98 7 reduce reduce VERB ital-3222 98 8 the the DET ital-3222 98 9 storage storage NOUN ital-3222 98 10 requirements requirement NOUN ital-3222 98 11 or or CCONJ ital-3222 98 12 also also ADV ital-3222 98 13 to to PART ital-3222 98 14 reduce reduce VERB ital-3222 98 15 the the DET ital-3222 98 16 time time NOUN ital-3222 98 17 of of ADP ital-3222 98 18 delivering deliver VERB ital-3222 98 19 a a DET ital-3222 98 20 requested request VERB ital-3222 98 21 document document NOUN ital-3222 98 22 to to ADP ital-3222 98 23 the the DET ital-3222 98 24 library library NOUN ital-3222 98 25 user user NOUN ital-3222 98 26 . . PUNCT ital-3222 99 1 in in ADP ital-3222 99 2 the the DET ital-3222 99 3 first first ADJ ital-3222 99 4 case case NOUN ital-3222 99 5 , , PUNCT ital-3222 99 6 the the DET ital-3222 99 7 decompression decompression NOUN ital-3222 99 8 must must AUX ital-3222 99 9 be be AUX ital-3222 99 10 done do VERB ital-3222 99 11 on on ADP ital-3222 99 12 the the DET ital-3222 99 13 server server NOUN ital-3222 99 14 side side NOUN ital-3222 99 15 . . 
PUNCT ital-3222 100 1 in in ADP ital-3222 100 2 the the DET ital-3222 100 3 second second ADJ ital-3222 100 4 case case NOUN ital-3222 100 5 , , PUNCT ital-3222 100 6 it it PRON ital-3222 100 7 must must AUX ital-3222 100 8 be be AUX ital-3222 100 9 done do VERB ital-3222 100 10 on on ADP ital-3222 100 11 the the DET ital-3222 100 12 client client NOUN ital-3222 100 13 side side NOUN ital-3222 100 14 , , PUNCT ital-3222 100 15 which which PRON ital-3222 100 16 is be AUX ital-3222 100 17 possible possible ADJ ital-3222 100 18 because because SCONJ ital-3222 100 19 stand stand VERB ital-3222 100 20 - - PUNCT ital-3222 100 21 alone alone ADJ ital-3222 100 22 decompressors decompressor NOUN ital-3222 100 23 are be AUX ital-3222 100 24 available available ADJ ital-3222 100 25 for for ADP ital-3222 100 26 microsoft microsoft PROPN ital-3222 100 27 windows window NOUN ital-3222 100 28 and and CCONJ ital-3222 100 29 linux linux PROPN ital-3222 100 30 . . PUNCT ital-3222 101 1 obviously obviously ADV ital-3222 101 2 , , PUNCT ital-3222 101 3 a a DET ital-3222 101 4 library library NOUN ital-3222 101 5 can can AUX ital-3222 101 6 support support VERB ital-3222 101 7 both both DET ital-3222 101 8 options option NOUN ital-3222 101 9 by by ADP ital-3222 101 10 providing provide VERB ital-3222 101 11 the the DET ital-3222 101 12 user user NOUN ital-3222 101 13 with with ADP ital-3222 101 14 a a DET ital-3222 101 15 choice choice NOUN ital-3222 101 16 whether whether SCONJ ital-3222 101 17 a a DET ital-3222 101 18 document document NOUN ital-3222 101 19 should should AUX ital-3222 101 20 be be AUX ital-3222 101 21 delivered deliver VERB ital-3222 101 22 compressed compressed ADJ ital-3222 101 23 or or CCONJ ital-3222 101 24 not not PART ital-3222 101 25 . . 
PUNCT ital-3222 102 1 if if SCONJ ital-3222 102 2 documents document NOUN ital-3222 102 3 are be AUX ital-3222 102 4 to to PART ital-3222 102 5 be be AUX ital-3222 102 6 decompressed decompress VERB ital-3222 102 7 client client NOUN ital-3222 102 8 - - PUNCT ital-3222 102 9 side side NOUN ital-3222 102 10 , , PUNCT ital-3222 102 11 the the DET ital-3222 102 12 basic basic ADJ ital-3222 102 13 ctdl ctdl NOUN ital-3222 102 14 , , PUNCT ital-3222 102 15 using use VERB ital-3222 102 16 a a DET ital-3222 102 17 semidynamic semidynamic ADJ ital-3222 102 18 dictionary dictionary NOUN ital-3222 102 19 , , PUNCT ital-3222 102 20 seems seem VERB ital-3222 102 21 handier handy ADJ ital-3222 102 22 , , PUNCT ital-3222 102 23 since since SCONJ ital-3222 102 24 it it PRON ital-3222 102 25 does do AUX ital-3222 102 26 not not PART ital-3222 102 27 require require VERB ital-3222 102 28 the the DET ital-3222 102 29 user user NOUN ital-3222 102 30 to to PART ital-3222 102 31 obtain obtain VERB ital-3222 102 32 the the DET ital-3222 102 33 static static ADJ ital-3222 102 34 dictionary dictionary NOUN ital-3222 102 35 that that PRON ital-3222 102 36 was be AUX ital-3222 102 37 used use VERB ital-3222 102 38 to to PART ital-3222 102 39 compress compress VERB ital-3222 102 40 the the DET ital-3222 102 41 downloaded download VERB ital-3222 102 42 document document NOUN ital-3222 102 43 . . 
PUNCT ital-3222 103 1 still still ADV ital-3222 103 2 , , PUNCT ital-3222 103 3 the the DET ital-3222 103 4 size size NOUN ital-3222 103 5 of of ADP ital-3222 103 6 such such DET ital-3222 103 7 a a DET ital-3222 103 8 dictionary dictionary NOUN ital-3222 103 9 is be AUX ital-3222 103 10 usually usually ADV ital-3222 103 11 small small ADJ ital-3222 103 12 , , PUNCT ital-3222 103 13 so so ADV ital-3222 103 14 it it PRON ital-3222 103 15 does do AUX ital-3222 103 16 not not PART ital-3222 103 17 disqualify disqualify VERB ital-3222 103 18 ctdl+ ctdl+ ADV ital-3222 103 19 from from ADP ital-3222 103 20 this this DET ital-3222 103 21 kind kind NOUN ital-3222 103 22 of of ADP ital-3222 103 23 use use NOUN ital-3222 103 24 . . PUNCT ital-3222 104 1 n n PROPN ital-3222 104 2 experimental experimental ADJ ital-3222 104 3 results result NOUN ital-3222 104 4 we we PRON ital-3222 104 5 tested test VERB ital-3222 104 6 ctdl ctdl NOUN ital-3222 104 7 experimentally experimentally ADV ital-3222 104 8 on on ADP ital-3222 104 9 a a DET ital-3222 104 10 benchmark benchmark NOUN ital-3222 104 11 set set NOUN ital-3222 104 12 of of ADP ital-3222 104 13 text text NOUN ital-3222 104 14 documents document NOUN ital-3222 104 15 . . PUNCT ital-3222 105 1 the the DET ital-3222 105 2 purpose purpose NOUN ital-3222 105 3 of of ADP ital-3222 105 4 the the DET ital-3222 105 5 tests test NOUN ital-3222 105 6 was be AUX ital-3222 105 7 to to PART ital-3222 105 8 compare compare VERB ital-3222 105 9 the the DET ital-3222 105 10 storage storage NOUN ital-3222 105 11 requirements requirement NOUN ital-3222 105 12 of of ADP ital-3222 105 13 different different ADJ ital-3222 105 14 document document NOUN ital-3222 105 15 formats format NOUN ital-3222 105 16 in in ADP ital-3222 105 17 compressed compressed ADJ ital-3222 105 18 and and CCONJ ital-3222 105 19 uncompressed uncompressed ADJ ital-3222 105 20 form form NOUN ital-3222 105 21 . . 
PUNCT ital-3222 106 1 in in ADP ital-3222 106 2 selecting select VERB ital-3222 106 3 the the DET ital-3222 106 4 test test NOUN ital-3222 106 5 files file NOUN ital-3222 106 6 we we PRON ital-3222 106 7 wanted want VERB ital-3222 106 8 to to PART ital-3222 106 9 achieve achieve VERB ital-3222 106 10 the the DET ital-3222 106 11 following following ADJ ital-3222 106 12 goals goal NOUN ital-3222 106 13 : : PUNCT ital-3222 106 14 n n X ital-3222 106 15 test test VERB ital-3222 106 16 all all DET ital-3222 106 17 the the DET ital-3222 106 18 formats format NOUN ital-3222 106 19 listed list VERB ital-3222 106 20 in in ADP ital-3222 106 21 table table NOUN ital-3222 106 22 1 1 NUM ital-3222 107 1 ( ( PUNCT ital-3222 107 2 therefore therefore ADV ital-3222 107 3 we we PRON ital-3222 107 4 decided decide VERB ital-3222 107 5 to to PART ital-3222 107 6 choose choose VERB ital-3222 107 7 documents document NOUN ital-3222 107 8 that that PRON ital-3222 107 9 produced produce VERB ital-3222 107 10 no no DET ital-3222 107 11 errors error NOUN ital-3222 107 12 during during ADP ital-3222 107 13 document document NOUN ital-3222 107 14 format format NOUN ital-3222 107 15 conversion conversion NOUN ital-3222 107 16 ) ) PUNCT ital-3222 107 17 n n CCONJ ital-3222 107 18 obtain obtain VERB ital-3222 107 19 verifiable verifiable ADJ ital-3222 107 20 results result NOUN ital-3222 107 21 ( ( PUNCT ital-3222 107 22 therefore therefore ADV ital-3222 107 23 we we PRON ital-3222 107 24 decided decide VERB ital-3222 107 25 to to PART ital-3222 107 26 use use VERB ital-3222 107 27 documents document NOUN ital-3222 107 28 that that PRON ital-3222 107 29 can can AUX ital-3222 107 30 be be AUX ital-3222 107 31 easily easily ADV ital-3222 107 32 obtained obtain VERB ital-3222 107 33 from from ADP ital-3222 107 34 the the DET ital-3222 107 35 internet internet NOUN ital-3222 107 36 ) ) PUNCT ital-3222 107 37 n n CCONJ ital-3222 107 38 measure measure VERB ital-3222 107 39 the the DET ital-3222 107 
40 actual actual ADJ ital-3222 107 41 compression compression NOUN ital-3222 107 42 improvement improvement NOUN ital-3222 107 43 from from ADP ital-3222 107 44 applying apply VERB ital-3222 107 45 the the DET ital-3222 107 46 proposed propose VERB ital-3222 107 47 scheme scheme NOUN ital-3222 107 48 ( ( PUNCT ital-3222 107 49 apart apart ADV ital-3222 107 50 from from ADP ital-3222 107 51 the the DET ital-3222 107 52 rtf rtf ADJ ital-3222 107 53 format format NOUN ital-3222 107 54 , , PUNCT ital-3222 107 55 the the DET ital-3222 107 56 scheme scheme NOUN ital-3222 107 57 is be AUX ital-3222 107 58 neutral neutral ADJ ital-3222 107 59 to to ADP ital-3222 107 60 the the DET ital-3222 107 61 images image NOUN ital-3222 107 62 embedded embed VERB ital-3222 107 63 in in ADP ital-3222 107 64 documents document NOUN ital-3222 107 65 ; ; PUNCT ital-3222 107 66 therefore therefore ADV ital-3222 107 67 we we PRON ital-3222 107 68 decided decide VERB ital-3222 107 69 to to PART ital-3222 107 70 use use VERB ital-3222 107 71 documents document NOUN ital-3222 107 72 that that PRON ital-3222 107 73 have have VERB ital-3222 107 74 no no DET ital-3222 107 75 embedded embed VERB ital-3222 107 76 images image NOUN ital-3222 107 77 ) ) PUNCT ital-3222 107 78 for for ADP ital-3222 107 79 these these DET ital-3222 107 80 reasons reason NOUN ital-3222 107 81 , , PUNCT ital-3222 107 82 we we PRON ital-3222 107 83 used use VERB ital-3222 107 84 the the DET ital-3222 107 85 following follow VERB ital-3222 107 86 procedure procedure NOUN ital-3222 107 87 for for ADP ital-3222 107 88 selecting select VERB ital-3222 107 89 documents document NOUN ital-3222 107 90 to to ADP ital-3222 107 91 the the DET ital-3222 107 92 test test NOUN ital-3222 107 93 set set VERB ital-3222 107 94 . . 
PUNCT ital-3222 108 1 first first ADV ital-3222 108 2 , , PUNCT ital-3222 108 3 we we PRON ital-3222 108 4 searched search VERB ital-3222 108 5 the the DET ital-3222 108 6 project project NOUN ital-3222 108 7 gutenberg gutenberg PROPN ital-3222 108 8 library library NOUN ital-3222 108 9 for for ADP ital-3222 108 10 tex tex PROPN ital-3222 108 11 documents document NOUN ital-3222 108 12 , , PUNCT ital-3222 108 13 as as SCONJ ital-3222 108 14 this this DET ital-3222 108 15 format format NOUN ital-3222 108 16 can can AUX ital-3222 108 17 most most ADV ital-3222 108 18 reliably reliably ADV ital-3222 108 19 be be AUX ital-3222 108 20 transformed transform VERB ital-3222 108 21 into into ADP ital-3222 108 22 the the DET ital-3222 108 23 other other ADJ ital-3222 108 24 formats format NOUN ital-3222 108 25 . . PUNCT ital-3222 109 1 from from ADP ital-3222 109 2 the the DET ital-3222 109 3 fifty fifty NUM ital-3222 109 4 - - PUNCT ital-3222 109 5 one one NUM ital-3222 109 6 retrieved retrieve VERB ital-3222 109 7 documents document NOUN ital-3222 109 8 , , PUNCT ital-3222 109 9 we we PRON ital-3222 109 10 removed remove VERB ital-3222 109 11 all all DET ital-3222 109 12 those those DET ital-3222 109 13 containing contain VERB ital-3222 109 14 images image NOUN ital-3222 109 15 as as ADV ital-3222 109 16 well well ADV ital-3222 109 17 as as ADP ital-3222 109 18 those those PRON ital-3222 109 19 that that PRON ital-3222 109 20 the the DET ital-3222 109 21 htlatex htlatex NOUN ital-3222 109 22 tool tool NOUN ital-3222 109 23 failed fail VERB ital-3222 109 24 to to PART ital-3222 109 25 convert convert VERB ital-3222 109 26 to to ADP ital-3222 109 27 html html PROPN ital-3222 109 28 . . 
PUNCT ital-3222 110 1 in in ADP ital-3222 110 2 the the DET ital-3222 110 3 eleven eleven NUM ital-3222 110 4 remaining remain VERB ital-3222 110 5 documents document NOUN ital-3222 110 6 , , PUNCT ital-3222 110 7 there there PRON ital-3222 110 8 were be VERB ital-3222 110 9 four four NUM ital-3222 110 10 jane jane ADJ ital-3222 110 11 austen austen ADJ ital-3222 110 12 books book NOUN ital-3222 110 13 ; ; PUNCT ital-3222 110 14 this this DET ital-3222 110 15 overrepresentation overrepresentation NOUN ital-3222 110 16 was be AUX ital-3222 110 17 handled handle VERB ital-3222 110 18 by by ADP ital-3222 110 19 removing remove VERB ital-3222 110 20 three three NUM ital-3222 110 21 of of ADP ital-3222 110 22 them they PRON ital-3222 110 23 . . PUNCT ital-3222 111 1 the the DET ital-3222 111 2 resulting result VERB ital-3222 111 3 eight eight NUM ital-3222 111 4 documents document NOUN ital-3222 111 5 are be AUX ital-3222 111 6 given give VERB ital-3222 111 7 in in ADP ital-3222 111 8 table table NOUN ital-3222 111 9 2 2 NUM ital-3222 111 10 . . PUNCT ital-3222 111 11 from from ADP ital-3222 111 12 the the DET ital-3222 111 13 tex tex NOUN ital-3222 111 14 files file NOUN ital-3222 111 15 we we PRON ital-3222 111 16 generated generate VERB ital-3222 111 17 html html PROPN ital-3222 111 18 , , PUNCT ital-3222 111 19 pdf pdf NOUN ital-3222 111 20 , , PUNCT ital-3222 111 21 and and CCONJ ital-3222 111 22 ps ps NOUN ital-3222 111 23 documents document NOUN ital-3222 111 24 . . 
PUNCT ital-3222 112 1 then then ADV ital-3222 112 2 we we PRON ital-3222 112 3 used use VERB ital-3222 112 4 word word NOUN ital-3222 112 5 2007 2007 NUM ital-3222 112 6 to to PART ital-3222 112 7 transform transform VERB ital-3222 112 8 html html PROPN ital-3222 112 9 documents document NOUN ital-3222 112 10 into into ADP ital-3222 112 11 rtf rtf PROPN ital-3222 112 12 , , PUNCT ital-3222 112 13 doc doc PROPN ital-3222 112 14 , , PUNCT ital-3222 112 15 and and CCONJ ital-3222 112 16 xml xml PROPN ital-3222 113 1 ( ( PUNCT ital-3222 113 2 thus thus ADV ital-3222 113 3 this this PRON ital-3222 113 4 is be AUX ital-3222 113 5 the the DET ital-3222 113 6 microsoft microsoft PROPN ital-3222 113 7 word word NOUN ital-3222 113 8 xml xml PROPN ital-3222 113 9 format format NOUN ital-3222 113 10 , , PUNCT ital-3222 113 11 not not PART ital-3222 113 12 the the DET ital-3222 113 13 project project NOUN ital-3222 113 14 gutenberg gutenberg PROPN ital-3222 113 15 xml xml PROPN ital-3222 113 16 format format NOUN ital-3222 113 17 ) ) PUNCT ital-3222 113 18 . . PUNCT ital-3222 114 1 the the DET ital-3222 114 2 txt txt ADJ ital-3222 114 3 files file NOUN ital-3222 114 4 were be AUX ital-3222 114 5 downloaded download VERB ital-3222 114 6 from from ADP ital-3222 114 7 project project PROPN ital-3222 114 8 gutenberg gutenberg PROPN ital-3222 114 9 . . 
PUNCT ital-3222 115 1 the the DET ital-3222 115 2 tests test NOUN ital-3222 115 3 were be AUX ital-3222 115 4 conducted conduct VERB ital-3222 115 5 on on ADP ital-3222 115 6 a a DET ital-3222 115 7 low low ADJ ital-3222 115 8 - - PUNCT ital-3222 115 9 end end NOUN ital-3222 115 10 amd amd NOUN ital-3222 115 11 sempron sempron NOUN ital-3222 115 12 3000 3000 NUM ital-3222 115 13 + + NUM ital-3222 115 14 1.80 1.80 NUM ital-3222 115 15 ghz ghz NOUN ital-3222 115 16 system system NOUN ital-3222 115 17 with with ADP ital-3222 115 18 512 512 NUM ital-3222 115 19 mb mb NOUN ital-3222 115 20 ram ram NOUN ital-3222 115 21 and and CCONJ ital-3222 115 22 a a DET ital-3222 115 23 seagate seagate ADJ ital-3222 115 24 80 80 NUM ital-3222 115 25 gb gb PROPN ital-3222 115 26 ata ata PROPN ital-3222 115 27 drive drive NOUN ital-3222 115 28 , , PUNCT ital-3222 115 29 running run VERB ital-3222 115 30 windows window NOUN ital-3222 115 31 xp xp PROPN ital-3222 115 32 sp2 sp2 NOUN ital-3222 115 33 . . PUNCT ital-3222 116 1 for for ADP ital-3222 116 2 comparison comparison NOUN ital-3222 116 3 purposes purpose NOUN ital-3222 116 4 , , PUNCT ital-3222 116 5 we we PRON ital-3222 116 6 used use VERB ital-3222 116 7 three three NUM ital-3222 116 8 generalpurpose generalpurpose NOUN ital-3222 116 9 compression compression NOUN ital-3222 116 10 programs program NOUN ital-3222 116 11 : : PUNCT ital-3222 116 12 n n ADV ital-3222 116 13 gzip gzip NOUN ital-3222 116 14 implementing implement VERB ital-3222 116 15 deflate deflate NOUN ital-3222 116 16 n n NOUN ital-3222 116 17 bzip2 bzip2 NOUN ital-3222 116 18 implementing implement VERB ital-3222 116 19 a a DET ital-3222 116 20 bwt bwt PROPN ital-3222 116 21 - - PUNCT ital-3222 116 22 based base VERB ital-3222 116 23 compression compression NOUN ital-3222 116 24 algorithm algorithm PROPN ital-3222 116 25 table table NOUN ital-3222 116 26 2 2 NUM ital-3222 116 27 . . 
PROPN ital-3222 116 28 test test NOUN ital-3222 116 29 set set VERB ital-3222 116 30 documents document NOUN ital-3222 116 31 specification specification NOUN ital-3222 116 32 file file NOUN ital-3222 116 33 name name NOUN ital-3222 116 34 title title NOUN ital-3222 116 35 author author NOUN ital-3222 116 36 tex tex NOUN ital-3222 116 37 size size NOUN ital-3222 116 38 ( ( PUNCT ital-3222 116 39 bytes byte NOUN ital-3222 116 40 ) ) PUNCT ital-3222 116 41 13601 13601 NUM ital-3222 116 42 - - PUNCT ital-3222 116 43 t t NOUN ital-3222 116 44 expositions exposition NOUN ital-3222 116 45 of of ADP ital-3222 116 46 holy holy ADJ ital-3222 116 47 scripture scripture NOUN ital-3222 116 48 : : PUNCT ital-3222 116 49 romans roman NOUN ital-3222 116 50 corinthians corinthian NOUN ital-3222 116 51 maclaren maclaren VERB ital-3222 116 52 1,443,056 1,443,056 NUM ital-3222 116 53 16514 16514 NUM ital-3222 116 54 - - PUNCT ital-3222 116 55 t t NOUN ital-3222 116 56 a a DET ital-3222 116 57 little little ADJ ital-3222 116 58 cook cook NOUN ital-3222 116 59 book book NOUN ital-3222 116 60 for for ADP ital-3222 116 61 a a DET ital-3222 116 62 little little ADJ ital-3222 116 63 girl girl NOUN ital-3222 116 64 benton benton NOUN ital-3222 116 65 220,480 220,480 NUM ital-3222 116 66 1noam10 1noam10 PROPN ital-3222 116 67 t t PROPN ital-3222 116 68 north north PROPN ital-3222 116 69 america america PROPN ital-3222 116 70 , , PUNCT ital-3222 116 71 v. v. 
PROPN ital-3222 116 72 1 1 NUM ital-3222 116 73 trollope trollope PROPN ital-3222 116 74 804,813 804,813 NUM ital-3222 116 75 2ws2610 2ws2610 NUM ital-3222 116 76 hamlet hamlet NOUN ital-3222 116 77 shakespeare shakespeare PROPN ital-3222 116 78 194,527 194,527 NUM ital-3222 116 79 alice30 alice30 ADJ ital-3222 116 80 alice alice NOUN ital-3222 116 81 in in ADP ital-3222 116 82 wonderland wonderland PROPN ital-3222 116 83 carroll carroll PROPN ital-3222 116 84 165,844 165,844 NUM ital-3222 116 85 cdscs10 cdscs10 ADJ ital-3222 116 86 t t NOUN ital-3222 116 87 some some DET ital-3222 116 88 christmas christmas PROPN ital-3222 116 89 stories story NOUN ital-3222 116 90 dickens dicken VERB ital-3222 116 91 127,684 127,684 NUM ital-3222 116 92 grimm10 grimm10 NOUN ital-3222 116 93 t t NOUN ital-3222 116 94 fairy fairy NOUN ital-3222 116 95 tales tale NOUN ital-3222 116 96 grimm grimm VERB ital-3222 116 97 535,842 535,842 NUM ital-3222 116 98 pandp12 pandp12 PUNCT ital-3222 116 99 t t NOUN ital-3222 116 100 pride pride NOUN ital-3222 116 101 and and CCONJ ital-3222 116 102 prejudice prejudice NOUN ital-3222 116 103 austen austen VERB ital-3222 116 104 727,415 727,415 NUM ital-3222 116 105 148 148 NUM ital-3222 116 106 information information NOUN ital-3222 116 107 technology technology NOUN ital-3222 116 108 and and CCONJ ital-3222 116 109 libraries library NOUN ital-3222 116 110 | | NOUN ital-3222 116 111 september september PROPN ital-3222 116 112 2009 2009 NUM ital-3222 116 113 n n CCONJ ital-3222 116 114 ppmvc ppmvc NOUN ital-3222 116 115 implementing implement VERB ital-3222 116 116 a a DET ital-3222 116 117 ppm ppm ADV ital-3222 116 118 - - PUNCT ital-3222 116 119 derived derive VERB ital-3222 116 120 compression compression NOUN ital-3222 116 121 algorithm24 algorithm24 NOUN ital-3222 116 122 tables table NOUN ital-3222 117 1 3–10 3–10 NUM ital-3222 117 2 show show NOUN ital-3222 117 3 n n ADP ital-3222 117 4 the the DET ital-3222 117 5 bitrate bitrate NOUN 
ital-3222 117 6 attained attain VERB ital-3222 117 7 on on ADP ital-3222 117 8 each each DET ital-3222 117 9 test test NOUN ital-3222 117 10 file file NOUN ital-3222 117 11 by by ADP ital-3222 117 12 the the DET ital-3222 117 13 deflatebased deflatebase VERB ital-3222 117 14 gzip gzip NOUN ital-3222 117 15 in in ADP ital-3222 117 16 default default NOUN ital-3222 117 17 mode mode NOUN ital-3222 117 18 , , PUNCT ital-3222 117 19 the the DET ital-3222 117 20 proposed propose VERB ital-3222 117 21 compression compression NOUN ital-3222 117 22 scheme scheme NOUN ital-3222 117 23 in in ADP ital-3222 117 24 the the DET ital-3222 117 25 semidynamic semidynamic ADJ ital-3222 117 26 and and CCONJ ital-3222 117 27 static static ADJ ital-3222 117 28 variants variant NOUN ital-3222 117 29 with with ADP ital-3222 117 30 deflate deflate NOUN ital-3222 117 31 as as ADP ital-3222 117 32 the the DET ital-3222 117 33 back back ADJ ital-3222 117 34 - - PUNCT ital-3222 117 35 end end NOUN ital-3222 117 36 compression compression NOUN ital-3222 117 37 algorithm algorithm NOUN ital-3222 117 38 , , PUNCT ital-3222 117 39 7 7 NUM ital-3222 117 40 - - SYM ital-3222 117 41 zip zip NOUN ital-3222 117 42 in in ADP ital-3222 117 43 lzma lzma PROPN ital-3222 117 44 mode mode NOUN ital-3222 117 45 , , PUNCT ital-3222 117 46 the the DET ital-3222 117 47 proposed propose VERB ital-3222 117 48 compression compression NOUN ital-3222 117 49 scheme scheme NOUN ital-3222 117 50 in in ADP ital-3222 117 51 the the DET ital-3222 117 52 semidynamic semidynamic ADJ ital-3222 117 53 and and CCONJ ital-3222 117 54 static static ADJ ital-3222 117 55 variants variant NOUN ital-3222 117 56 with with ADP ital-3222 117 57 lzma lzma PROPN ital-3222 117 58 as as ADP ital-3222 117 59 the the DET ital-3222 117 60 back back ADJ ital-3222 117 61 - - PUNCT ital-3222 117 62 end end NOUN ital-3222 117 63 compression compression NOUN ital-3222 117 64 algorithm algorithm NOUN ital-3222 117 65 , , PUNCT ital-3222 117 66 bzip2 
bzip2 NOUN ital-3222 117 67 and and CCONJ ital-3222 117 68 ppmvc ppmvc NOUN ital-3222 117 69 ; ; PUNCT ital-3222 117 70 n n CCONJ ital-3222 117 71 the the DET ital-3222 117 72 average average ADJ ital-3222 117 73 bitrate bitrate NOUN ital-3222 117 74 attained attain VERB ital-3222 117 75 on on ADP ital-3222 117 76 the the DET ital-3222 117 77 whole whole ADJ ital-3222 117 78 test test NOUN ital-3222 117 79 corpus corpus NOUN ital-3222 117 80 ; ; PUNCT ital-3222 117 81 and and CCONJ ital-3222 117 82 n n ADV ital-3222 117 83 the the DET ital-3222 117 84 total total ADJ ital-3222 117 85 compression compression NOUN ital-3222 117 86 and and CCONJ ital-3222 117 87 decompression decompression NOUN ital-3222 117 88 times time NOUN ital-3222 117 89 ( ( PUNCT ital-3222 117 90 in in ADP ital-3222 117 91 seconds second NOUN ital-3222 117 92 ) ) PUNCT ital-3222 117 93 for for ADP ital-3222 117 94 the the DET ital-3222 117 95 whole whole ADJ ital-3222 117 96 test test NOUN ital-3222 117 97 corpus corpus NOUN ital-3222 117 98 , , PUNCT ital-3222 117 99 measured measure VERB ital-3222 117 100 on on ADP ital-3222 117 101 the the DET ital-3222 117 102 test test NOUN ital-3222 117 103 platform platform NOUN ital-3222 117 104 ( ( PUNCT ital-3222 117 105 they they PRON ital-3222 117 106 are be AUX ital-3222 117 107 total total ADJ ital-3222 117 108 elapsed elapse VERB ital-3222 117 109 times time NOUN ital-3222 117 110 including include VERB ital-3222 117 111 program program NOUN ital-3222 117 112 initialization initialization NOUN ital-3222 117 113 and and CCONJ ital-3222 117 114 disk disk NOUN ital-3222 117 115 operations operation NOUN ital-3222 117 116 ) ) PUNCT ital-3222 117 117 . . 
PUNCT ital-3222 118 1 bitrates bitrates AUX ital-3222 118 2 are be AUX ital-3222 118 3 given give VERB ital-3222 118 4 in in ADP ital-3222 118 5 output output NOUN ital-3222 118 6 bits bit NOUN ital-3222 118 7 per per ADP ital-3222 118 8 character character NOUN ital-3222 118 9 of of ADP ital-3222 118 10 an an DET ital-3222 118 11 uncompressed uncompressed ADJ ital-3222 118 12 document document NOUN ital-3222 118 13 in in ADP ital-3222 118 14 a a DET ital-3222 118 15 given give VERB ital-3222 118 16 format format NOUN ital-3222 118 17 , , PUNCT ital-3222 118 18 so so CCONJ ital-3222 118 19 a a DET ital-3222 118 20 smaller small ADJ ital-3222 118 21 table table NOUN ital-3222 118 22 3 3 NUM ital-3222 118 23 . . NOUN ital-3222 118 24 compression compression NOUN ital-3222 118 25 efficiency efficiency NOUN ital-3222 118 26 and and CCONJ ital-3222 118 27 times time NOUN ital-3222 118 28 for for ADP ital-3222 118 29 the the DET ital-3222 118 30 txt txt ADJ ital-3222 118 31 documents document NOUN ital-3222 118 32 deflate deflate VERB ital-3222 118 33 lzma lzma PROPN ital-3222 118 34 bzip2 bzip2 NOUN ital-3222 118 35 ppmvc ppmvc NOUN ital-3222 118 36 file file NOUN ital-3222 118 37 name name NOUN ital-3222 118 38 gzip gzip PROPN ital-3222 118 39 ctdl ctdl PROPN ital-3222 118 40 ctdl+ ctdl+ X ital-3222 118 41 7 7 NUM ital-3222 118 42 - - PUNCT ital-3222 118 43 zip zip NOUN ital-3222 118 44 ctdl ctdl NOUN ital-3222 118 45 ctdl+ ctdl+ X ital-3222 118 46 13601 13601 NUM ital-3222 118 47 - - PUNCT ital-3222 118 48 t t NOUN ital-3222 118 49 2.944 2.944 NUM ital-3222 118 50 2.244 2.244 NUM ital-3222 118 51 2.101 2.101 NUM ital-3222 118 52 2.337 2.337 NUM ital-3222 118 53 2.057 2.057 NUM ital-3222 118 54 1.919 1.919 NUM ital-3222 118 55 2.158 2.158 NUM ital-3222 118 56 1.863 1.863 NUM ital-3222 118 57 16514 16514 NUM ital-3222 118 58 - - PUNCT ital-3222 118 59 t t NOUN ital-3222 118 60 2.566 2.566 NUM ital-3222 118 61 2.150 2.150 NUM ital-3222 118 62 1.969 1.969 NUM ital-3222 
118 63 2.228 2.228 NUM ital-3222 118 64 1.993 1.993 NUM ital-3222 118 65 1.838 1.838 NUM ital-3222 118 66 2.010 2.010 NUM ital-3222 118 67 1.780 1.780 NUM ital-3222 118 68 1noam10 1noam10 NUM ital-3222 118 69 t t NOUN ital-3222 118 70 2.967 2.967 NUM ital-3222 118 71 2.337 2.337 NUM ital-3222 118 72 2.109 2.109 NUM ital-3222 118 73 2.432 2.432 NUM ital-3222 118 74 2.151 2.151 NUM ital-3222 118 75 1.958 1.958 NUM ital-3222 118 76 2.160 2.160 NUM ital-3222 118 77 1.946 1.946 NUM ital-3222 118 78 2ws2610 2ws2610 NUM ital-3222 118 79 3.217 3.217 NUM ital-3222 118 80 2.874 2.874 NUM ital-3222 118 81 2.459 2.459 NUM ital-3222 118 82 2.871 2.871 NUM ital-3222 118 83 2.659 2.659 NUM ital-3222 118 84 2.312 2.312 NUM ital-3222 118 85 2.565 2.565 NUM ital-3222 118 86 2.343 2.343 NUM ital-3222 118 87 alice30 alice30 NOUN ital-3222 118 88 2.906 2.906 NUM ital-3222 118 89 2.533 2.533 NUM ital-3222 118 90 2.184 2.184 NUM ital-3222 118 91 2.585 2.585 NUM ital-3222 118 92 2.360 2.360 NUM ital-3222 118 93 2.056 2.056 NUM ital-3222 118 94 2.341 2.341 NUM ital-3222 118 95 2.090 2.090 NUM ital-3222 118 96 cdscs10 cdscs10 ADJ ital-3222 118 97 t t NOUN ital-3222 118 98 3.222 3.222 NUM ital-3222 118 99 2.898 2.898 NUM ital-3222 118 100 2.298 2.298 NUM ital-3222 118 101 2.928 2.928 NUM ital-3222 118 102 2.721 2.721 NUM ital-3222 118 103 2.192 2.192 NUM ital-3222 118 104 2.694 2.694 NUM ital-3222 118 105 2.436 2.436 NUM ital-3222 118 106 grimm10 grimm10 NOUN ital-3222 118 107 t t NOUN ital-3222 118 108 2.832 2.832 NUM ital-3222 118 109 2.275 2.275 NUM ital-3222 118 110 2.090 2.090 NUM ital-3222 118 111 2.357 2.357 NUM ital-3222 118 112 2.079 2.079 NUM ital-3222 118 113 1.931 1.931 NUM ital-3222 118 114 2.112 2.112 NUM ital-3222 118 115 1.886 1.886 NUM ital-3222 118 116 pandp12 pandp12 NOUN ital-3222 118 117 t t NOUN ital-3222 118 118 2.901 2.901 NUM ital-3222 118 119 2.251 2.251 NUM ital-3222 118 120 2.097 2.097 NUM ital-3222 118 121 2.366 2.366 NUM ital-3222 118 122 2.061 2.061 NUM 
ital-3222 118 123 1.930 1.930 NUM ital-3222 118 124 2.032 2.032 NUM ital-3222 118 125 1.835 1.835 NUM ital-3222 118 126 average average ADJ ital-3222 118 127 2.944 2.944 NUM ital-3222 118 128 2.445 2.445 NUM ital-3222 118 129 2.163 2.163 NUM ital-3222 118 130 2.513 2.513 NUM ital-3222 118 131 2.260 2.260 NUM ital-3222 118 132 2.017 2.017 NUM ital-3222 118 133 2.259 2.259 NUM ital-3222 118 134 2.022 2.022 NUM ital-3222 118 135 comp comp NOUN ital-3222 118 136 . . PUNCT ital-3222 119 1 time time NOUN ital-3222 119 2 0.688 0.688 NUM ital-3222 119 3 1.234 1.234 NUM ital-3222 119 4 0.954 0.954 NUM ital-3222 119 5 6.688 6.688 NUM ital-3222 119 6 2.640 2.640 NUM ital-3222 119 7 2.281 2.281 NUM ital-3222 119 8 2.110 2.110 NUM ital-3222 119 9 3.281 3.281 NUM ital-3222 119 10 dec dec PROPN ital-3222 119 11 . . PROPN ital-3222 119 12 time time NOUN ital-3222 119 13 0.125 0.125 NUM ital-3222 119 14 0.454 0.454 NUM ital-3222 119 15 0.546 0.546 NUM ital-3222 119 16 0.343 0.343 NUM ital-3222 119 17 0.610 0.610 NUM ital-3222 119 18 0.656 0.656 NUM ital-3222 119 19 0.703 0.703 NUM ital-3222 119 20 3.453 3.453 NUM ital-3222 119 21 table table NOUN ital-3222 119 22 4 4 NUM ital-3222 119 23 . . 
NOUN ital-3222 119 24 compression compression NOUN ital-3222 119 25 efficiency efficiency NOUN ital-3222 119 26 and and CCONJ ital-3222 119 27 times time NOUN ital-3222 119 28 for for ADP ital-3222 119 29 the the DET ital-3222 119 30 tex tex PROPN ital-3222 119 31 documents document NOUN ital-3222 119 32 deflate deflate VERB ital-3222 119 33 lzma lzma PROPN ital-3222 119 34 bzip2 bzip2 NOUN ital-3222 119 35 ppmvc ppmvc NOUN ital-3222 119 36 file file NOUN ital-3222 119 37 name name NOUN ital-3222 119 38 gzip gzip PROPN ital-3222 119 39 ctdl ctdl PROPN ital-3222 119 40 ctdl+ ctdl+ X ital-3222 120 1 7 7 NUM ital-3222 120 2 - - PUNCT ital-3222 120 3 zip zip NOUN ital-3222 120 4 ctdl ctdl NOUN ital-3222 120 5 ctdl+ ctdl+ X ital-3222 120 6 13601 13601 NUM ital-3222 120 7 - - PUNCT ital-3222 120 8 t t NOUN ital-3222 120 9 2.927 2.927 NUM ital-3222 120 10 2.233 2.233 NUM ital-3222 120 11 2.092 2.092 NUM ital-3222 120 12 2.328 2.328 NUM ital-3222 120 13 2.049 2.049 NUM ital-3222 120 14 1.913 1.913 NUM ital-3222 120 15 2.146 2.146 NUM ital-3222 120 16 1.852 1.852 NUM ital-3222 120 17 16514 16514 NUM ital-3222 120 18 - - PUNCT ital-3222 120 19 t t NOUN ital-3222 120 20 2.277 2.277 NUM ital-3222 120 21 1.904 1.904 NUM ital-3222 120 22 1.794 1.794 NUM ital-3222 120 23 1.957 1.957 NUM ital-3222 120 24 1.744 1.744 NUM ital-3222 120 25 1.645 1.645 NUM ital-3222 120 26 1.746 1.746 NUM ital-3222 120 27 1.534 1.534 NUM ital-3222 120 28 1noam10 1noam10 NUM ital-3222 120 29 t t NOUN ital-3222 120 30 2.976 2.976 NUM ital-3222 120 31 2.370 2.370 NUM ital-3222 120 32 2.142 2.142 NUM ital-3222 120 33 2.445 2.445 NUM ital-3222 120 34 2.186 2.186 NUM ital-3222 120 35 1.986 1.986 NUM ital-3222 120 36 2.195 2.195 NUM ital-3222 120 37 1.976 1.976 NUM ital-3222 120 38 2ws2610 2ws2610 NUM ital-3222 120 39 3.206 3.206 NUM ital-3222 120 40 2.906 2.906 NUM ital-3222 120 41 2.482 2.482 NUM ital-3222 120 42 2.864 2.864 NUM ital-3222 120 43 2.674 2.674 NUM ital-3222 120 44 2.323 2.323 NUM ital-3222 
120 45 2.562 2.562 NUM ital-3222 120 46 2.340 2.340 NUM ital-3222 120 47 alice30 alice30 NOUN ital-3222 120 48 2.897 2.897 NUM ital-3222 120 49 2.526 2.526 NUM ital-3222 120 50 2.183 2.183 NUM ital-3222 120 51 2.573 2.573 NUM ital-3222 120 52 2.350 2.350 NUM ital-3222 120 53 2.048 2.048 NUM ital-3222 120 54 2.332 2.332 NUM ital-3222 120 55 2.085 2.085 NUM ital-3222 120 56 cdscs10 cdscs10 ADJ ital-3222 120 57 t t NOUN ital-3222 120 58 3.224 3.224 NUM ital-3222 120 59 2.931 2.931 NUM ital-3222 120 60 2.328 2.328 NUM ital-3222 120 61 2.941 2.941 NUM ital-3222 120 62 2.759 2.759 NUM ital-3222 120 63 2.222 2.222 NUM ital-3222 120 64 2.723 2.723 NUM ital-3222 120 65 2.466 2.466 NUM ital-3222 120 66 grimm10 grimm10 NOUN ital-3222 120 67 t t NOUN ital-3222 120 68 2.831 2.831 NUM ital-3222 120 69 2.304 2.304 NUM ital-3222 120 70 2.120 2.120 NUM ital-3222 120 71 2.364 2.364 NUM ital-3222 120 72 2.113 2.113 NUM ital-3222 120 73 1.960 1.960 NUM ital-3222 120 74 2.143 2.143 NUM ital-3222 120 75 1.910 1.910 NUM ital-3222 120 76 pandp12 pandp12 NOUN ital-3222 120 77 t t NOUN ital-3222 120 78 2.881 2.881 NUM ital-3222 120 79 2.239 2.239 NUM ital-3222 120 80 2.090 2.090 NUM ital-3222 120 81 2.346 2.346 NUM ital-3222 120 82 2.049 2.049 NUM ital-3222 120 83 1.916 1.916 NUM ital-3222 120 84 2.013 2.013 NUM ital-3222 120 85 1.817 1.817 NUM ital-3222 120 86 average average ADJ ital-3222 120 87 2.902 2.902 NUM ital-3222 120 88 2.427 2.427 NUM ital-3222 120 89 2.154 2.154 NUM ital-3222 120 90 2.477 2.477 NUM ital-3222 120 91 2.241 2.241 NUM ital-3222 120 92 2.002 2.002 NUM ital-3222 120 93 2.233 2.233 NUM ital-3222 120 94 1.998 1.998 NUM ital-3222 120 95 comp comp NOUN ital-3222 120 96 . . 
PUNCT ital-3222 121 1 time time NOUN ital-3222 121 2 0.688 0.688 NUM ital-3222 121 3 1.250 1.250 NUM ital-3222 121 4 0.969 0.969 NUM ital-3222 121 5 6.718 6.718 NUM ital-3222 121 6 2.703 2.703 NUM ital-3222 121 7 2.406 2.406 NUM ital-3222 121 8 2.140 2.140 NUM ital-3222 121 9 3.329 3.329 NUM ital-3222 121 10 dec dec PROPN ital-3222 121 11 . . PROPN ital-3222 121 12 time time PROPN ital-3222 121 13 0.109 0.109 NUM ital-3222 121 14 0.453 0.453 NUM ital-3222 121 15 0.547 0.547 NUM ital-3222 121 16 0.360 0.360 NUM ital-3222 121 17 0.609 0.609 NUM ital-3222 121 18 0.672 0.672 NUM ital-3222 121 19 0.703 0.703 NUM ital-3222 121 20 3.485 3.485 NUM ital-3222 121 21 the the DET ital-3222 121 22 efficient efficient ADJ ital-3222 121 23 storage storage NOUN ital-3222 121 24 of of ADP ital-3222 121 25 text text NOUN ital-3222 121 26 documents document NOUN ital-3222 121 27 in in ADP ital-3222 121 28 digital digital ADJ ital-3222 121 29 libraries library NOUN ital-3222 121 30 | | NOUN ital-3222 121 31 skibiński skibiński NOUN ital-3222 121 32 and and CCONJ ital-3222 121 33 swacha swacha VERB ital-3222 121 34 149 149 NUM ital-3222 121 35 bitrate bitrate NOUN ital-3222 121 36 ( ( PUNCT ital-3222 121 37 of of ADP ital-3222 121 38 , , PUNCT ital-3222 121 39 e.g. e.g. 
ADV ital-3222 121 40 , , PUNCT ital-3222 121 41 rtf rtf VERB ital-3222 121 42 documents document NOUN ital-3222 121 43 compared compare VERB ital-3222 121 44 to to ADP ital-3222 121 45 the the DET ital-3222 121 46 plain plain ADJ ital-3222 121 47 text text NOUN ital-3222 121 48 ) ) PUNCT ital-3222 121 49 does do AUX ital-3222 121 50 not not PART ital-3222 121 51 mean mean VERB ital-3222 121 52 the the DET ital-3222 121 53 file file NOUN ital-3222 121 54 is be AUX ital-3222 121 55 smaller small ADJ ital-3222 121 56 , , PUNCT ital-3222 121 57 only only ADV ital-3222 121 58 that that SCONJ ital-3222 121 59 the the DET ital-3222 121 60 compression compression NOUN ital-3222 121 61 was be AUX ital-3222 121 62 better well ADJ ital-3222 121 63 . . PUNCT ital-3222 122 1 uncompressed uncompressed ADJ ital-3222 122 2 files file NOUN ital-3222 122 3 have have VERB ital-3222 122 4 a a DET ital-3222 122 5 bitrate bitrate NOUN ital-3222 122 6 of of ADP ital-3222 122 7 8 8 NUM ital-3222 122 8 bits bit NOUN ital-3222 122 9 per per ADP ital-3222 122 10 character character NOUN ital-3222 122 11 . . 
PUNCT ital-3222 123 1 looking look VERB ital-3222 123 2 at at ADP ital-3222 123 3 the the DET ital-3222 123 4 results result NOUN ital-3222 123 5 obtained obtain VERB ital-3222 123 6 for for ADP ital-3222 123 7 txt txt ADJ ital-3222 123 8 documents document NOUN ital-3222 123 9 ( ( PUNCT ital-3222 123 10 table table NOUN ital-3222 123 11 3 3 NUM ital-3222 123 12 ) ) PUNCT ital-3222 123 13 , , PUNCT ital-3222 123 14 we we PRON ital-3222 123 15 can can AUX ital-3222 123 16 see see VERB ital-3222 123 17 an an DET ital-3222 123 18 average average ADJ ital-3222 123 19 improvement improvement NOUN ital-3222 123 20 of of ADP ital-3222 123 21 17 17 NUM ital-3222 123 22 percent percent NOUN ital-3222 123 23 for for ADP ital-3222 123 24 ctdl ctdl NOUN ital-3222 123 25 and and CCONJ ital-3222 123 26 27 27 NUM ital-3222 123 27 percent percent NOUN ital-3222 123 28 for for ADP ital-3222 123 29 ctdl+ ctdl+ NOUN ital-3222 123 30 compared compare VERB ital-3222 123 31 to to ADP ital-3222 123 32 the the DET ital-3222 123 33 baseline baseline ADJ ital-3222 123 34 deflate deflate NOUN ital-3222 123 35 implementation implementation NOUN ital-3222 123 36 . . PUNCT ital-3222 124 1 compared compare VERB ital-3222 124 2 to to ADP ital-3222 124 3 the the DET ital-3222 124 4 baseline baseline PROPN ital-3222 124 5 lzma lzma PROPN ital-3222 124 6 implementation implementation NOUN ital-3222 124 7 , , PUNCT ital-3222 124 8 the the DET ital-3222 124 9 improvement improvement NOUN ital-3222 124 10 is be AUX ital-3222 124 11 10 10 NUM ital-3222 124 12 percent percent NOUN ital-3222 124 13 for for ADP ital-3222 124 14 ctdl ctdl NOUN ital-3222 124 15 and and CCONJ ital-3222 124 16 20 20 NUM ital-3222 124 17 percent percent NOUN ital-3222 124 18 for for ADP ital-3222 124 19 ctdl+ ctdl+ PROPN ital-3222 124 20 . . 
PUNCT ital-3222 125 1 also also ADV ital-3222 125 2 , , PUNCT ital-3222 125 3 ctdl+ ctdl+ X ital-3222 125 4 combined combine VERB ital-3222 125 5 with with ADP ital-3222 125 6 lzma lzma PROPN ital-3222 125 7 compresses compress NOUN ital-3222 125 8 txt txt VERB ital-3222 125 9 documents document NOUN ital-3222 125 10 31 31 NUM ital-3222 125 11 percent percent NOUN ital-3222 125 12 better well ADJ ital-3222 125 13 than than ADP ital-3222 125 14 gzip gzip PROPN ital-3222 125 15 , , PUNCT ital-3222 125 16 11 11 NUM ital-3222 125 17 percent percent NOUN ital-3222 125 18 better well ADJ ital-3222 125 19 than than ADP ital-3222 125 20 bzip2 bzip2 NOUN ital-3222 125 21 , , PUNCT ital-3222 125 22 and and CCONJ ital-3222 125 23 slightly slightly ADV ital-3222 125 24 better well ADJ ital-3222 125 25 than than ADP ital-3222 125 26 the the DET ital-3222 125 27 state state NOUN ital-3222 125 28 - - PUNCT ital-3222 125 29 of of ADP ital-3222 125 30 - - PUNCT ital-3222 125 31 the the DET ital-3222 125 32 - - PUNCT ital-3222 125 33 art art NOUN ital-3222 125 34 ppmvc ppmvc NOUN ital-3222 125 35 implementation implementation NOUN ital-3222 125 36 . . 
PUNCT ital-3222 126 1 in in ADP ital-3222 126 2 case case NOUN ital-3222 126 3 of of ADP ital-3222 126 4 tex tex PROPN ital-3222 126 5 documents document NOUN ital-3222 126 6 ( ( PUNCT ital-3222 126 7 table table NOUN ital-3222 126 8 4 4 NUM ital-3222 126 9 ) ) PUNCT ital-3222 126 10 , , PUNCT ital-3222 126 11 the the DET ital-3222 126 12 gzip gzip NOUN ital-3222 126 13 results result NOUN ital-3222 126 14 were be AUX ital-3222 126 15 improved improve VERB ital-3222 126 16 , , PUNCT ital-3222 126 17 on on ADP ital-3222 126 18 average average ADJ ital-3222 126 19 , , PUNCT ital-3222 126 20 by by ADP ital-3222 126 21 16 16 NUM ital-3222 126 22 percent percent NOUN ital-3222 126 23 using use VERB ital-3222 126 24 ctdl ctdl NOUN ital-3222 126 25 and and CCONJ ital-3222 126 26 by by ADP ital-3222 126 27 26 26 NUM ital-3222 126 28 percent percent NOUN ital-3222 126 29 using use VERB ital-3222 126 30 ctdl+ ctdl+ PROPN ital-3222 126 31 ; ; PUNCT ital-3222 126 32 the the DET ital-3222 126 33 numbers number NOUN ital-3222 126 34 for for ADP ital-3222 126 35 lzma lzma PROPN ital-3222 126 36 are be AUX ital-3222 126 37 10 10 NUM ital-3222 126 38 percent percent NOUN ital-3222 126 39 for for ADP ital-3222 126 40 ctdl ctdl NOUN ital-3222 126 41 and and CCONJ ital-3222 126 42 19 19 NUM ital-3222 126 43 percent percent NOUN ital-3222 126 44 for for ADP ital-3222 126 45 ctdl+ ctdl+ PROPN ital-3222 126 46 . . 
PUNCT ital-3222 127 1 in in ADP ital-3222 127 2 a a DET ital-3222 127 3 cross cross ADJ ital-3222 127 4 - - ADJ ital-3222 127 5 method method ADJ ital-3222 127 6 comparison comparison NOUN ital-3222 127 7 , , PUNCT ital-3222 127 8 ctdl+ ctdl+ ADP ital-3222 127 9 with with ADP ital-3222 127 10 lzma lzma NOUN ital-3222 127 11 beats beat NOUN ital-3222 127 12 gzip gzip NOUN ital-3222 127 13 by by ADP ital-3222 127 14 31 31 NUM ital-3222 127 15 percent percent NOUN ital-3222 127 16 , , PUNCT ital-3222 127 17 bzip2 bzip2 NOUN ital-3222 127 18 by by ADP ital-3222 127 19 10 10 NUM ital-3222 127 20 percent percent NOUN ital-3222 127 21 , , PUNCT ital-3222 127 22 and and CCONJ ital-3222 127 23 attains attain NOUN ital-3222 127 24 results result VERB ital-3222 127 25 very very ADV ital-3222 127 26 close close ADJ ital-3222 127 27 to to ADP ital-3222 127 28 ppmvc ppmvc VERB ital-3222 127 29 . . PUNCT ital-3222 128 1 on on ADP ital-3222 128 2 average average ADJ ital-3222 128 3 , , PUNCT ital-3222 128 4 deflate deflate NOUN ital-3222 128 5 - - PUNCT ital-3222 128 6 based base VERB ital-3222 128 7 ctdl ctdl PROPN ital-3222 128 8 compressed compress VERB ital-3222 128 9 xml xml NOUN ital-3222 128 10 documents document NOUN ital-3222 128 11 20 20 NUM ital-3222 128 12 percent percent NOUN ital-3222 128 13 better well ADJ ital-3222 128 14 than than ADP ital-3222 128 15 the the DET ital-3222 128 16 baseline baseline NOUN ital-3222 128 17 algorithm algorithm NOUN ital-3222 128 18 ( ( PUNCT ital-3222 128 19 table table NOUN ital-3222 128 20 5 5 NUM ital-3222 128 21 ) ) PUNCT ital-3222 128 22 , , PUNCT ital-3222 128 23 and and CCONJ ital-3222 128 24 with with ADP ital-3222 128 25 ctdl+ ctdl+ X ital-3222 128 26 the the DET ital-3222 128 27 improvement improvement NOUN ital-3222 128 28 rises rise VERB ital-3222 128 29 to to ADP ital-3222 128 30 26 26 NUM ital-3222 128 31 percent percent NOUN ital-3222 128 32 . . 
PUNCT ital-3222 129 1 ctdl ctdl PROPN ital-3222 129 2 improves improve VERB ital-3222 129 3 lzma lzma ADJ ital-3222 129 4 compression compression NOUN ital-3222 129 5 by by ADP ital-3222 129 6 11 11 NUM ital-3222 129 7 percent percent NOUN ital-3222 129 8 , , PUNCT ital-3222 129 9 and and CCONJ ital-3222 129 10 ctdl+ ctdl+ PROPN ital-3222 129 11 improves improve VERB ital-3222 129 12 it it PRON ital-3222 129 13 by by ADP ital-3222 129 14 18 18 NUM ital-3222 129 15 percent percent NOUN ital-3222 129 16 . . PUNCT ital-3222 130 1 ctdl+ ctdl+ X ital-3222 130 2 with with ADP ital-3222 130 3 lzma lzma PROPN ital-3222 130 4 beats beat NOUN ital-3222 130 5 gzip gzip NOUN ital-3222 130 6 by by ADP ital-3222 130 7 33 33 NUM ital-3222 130 8 percent percent NOUN ital-3222 130 9 , , PUNCT ital-3222 130 10 bzip2 bzip2 NOUN ital-3222 130 11 by by ADP ital-3222 130 12 8 8 NUM ital-3222 130 13 percent percent NOUN ital-3222 130 14 , , PUNCT ital-3222 130 15 and and CCONJ ital-3222 130 16 loses lose VERB ital-3222 130 17 only only ADV ital-3222 130 18 4 4 NUM ital-3222 130 19 percent percent NOUN ital-3222 130 20 to to PART ital-3222 130 21 ppmvc ppmvc VERB ital-3222 130 22 . . 
PUNCT ital-3222 131 1 similar similar ADJ ital-3222 131 2 results result NOUN ital-3222 131 3 were be AUX ital-3222 131 4 obtained obtain VERB ital-3222 131 5 for for ADP ital-3222 131 6 html html NOUN ital-3222 131 7 documents document NOUN ital-3222 131 8 ( ( PUNCT ital-3222 131 9 table table NOUN ital-3222 131 10 6 6 NUM ital-3222 131 11 ): ): PUNCT ital-3222 131 12 they they PRON ital-3222 131 13 were be AUX ital-3222 131 14 compressed compress VERB ital-3222 131 15 with with ADP ital-3222 131 16 ctdl ctdl NOUN ital-3222 131 17 and and CCONJ ital-3222 131 18 deflate deflate VERB ital-3222 131 19 18 18 NUM ital-3222 131 20 percent percent NOUN ital-3222 131 21 better well ADJ ital-3222 131 22 than than ADP ital-3222 131 23 with with ADP ital-3222 131 24 the the DET ital-3222 131 25 deflate deflate ADJ ital-3222 131 26 algorithm algorithm NOUN ital-3222 131 27 alone alone ADV ital-3222 131 28 , , PUNCT ital-3222 131 29 and and CCONJ ital-3222 131 30 27 27 NUM ital-3222 131 31 percent percent NOUN ital-3222 131 32 better well ADV ital-3222 131 33 with with ADP ital-3222 131 34 ctdl+ ctdl+ PROPN ital-3222 131 35 . . PUNCT ital-3222 132 1 lzma lzma PROPN ital-3222 132 2 compression compression NOUN ital-3222 132 3 efficiency efficiency NOUN ital-3222 132 4 is be AUX ital-3222 132 5 improved improve VERB ital-3222 132 6 by by ADP ital-3222 132 7 11 11 NUM ital-3222 132 8 percent percent NOUN ital-3222 132 9 with with ADP ital-3222 132 10 ctdl ctdl NOUN ital-3222 132 11 and and CCONJ ital-3222 132 12 20 20 NUM ital-3222 132 13 percent percent NOUN ital-3222 132 14 with with ADP ital-3222 132 15 ctdl+ ctdl+ PROPN ital-3222 132 16 . . 
PUNCT ital-3222 133 1 ctdl+ ctdl+ X ital-3222 133 2 with with ADP ital-3222 133 3 lzma lzma PROPN ital-3222 133 4 beats beat NOUN ital-3222 133 5 gzip gzip NOUN ital-3222 133 6 by by ADP ital-3222 133 7 33 33 NUM ital-3222 133 8 percent percent NOUN ital-3222 133 9 , , PUNCT ital-3222 133 10 bzip2 bzip2 NOUN ital-3222 133 11 by by ADP ital-3222 133 12 9 9 NUM ital-3222 133 13 percent percent NOUN ital-3222 133 14 , , PUNCT ital-3222 133 15 and and CCONJ ital-3222 133 16 loses lose VERB ital-3222 133 17 only only ADV ital-3222 133 18 2 2 NUM ital-3222 133 19 percent percent NOUN ital-3222 133 20 to to PART ital-3222 133 21 ppmvc ppmvc VERB ital-3222 133 22 . . PUNCT ital-3222 134 1 for for ADP ital-3222 134 2 rtf rtf ADJ ital-3222 134 3 documents document NOUN ital-3222 134 4 ( ( PUNCT ital-3222 134 5 table table NOUN ital-3222 134 6 7 7 NUM ital-3222 134 7 ) ) PUNCT ital-3222 134 8 , , PUNCT ital-3222 134 9 the the DET ital-3222 134 10 gzip gzip NOUN ital-3222 134 11 results result NOUN ital-3222 134 12 were be AUX ital-3222 134 13 improved improve VERB ital-3222 134 14 , , PUNCT ital-3222 134 15 on on ADP ital-3222 134 16 average average ADJ ital-3222 134 17 , , PUNCT ital-3222 134 18 by by ADP ital-3222 134 19 18 18 NUM ital-3222 134 20 percent percent NOUN ital-3222 134 21 using use VERB ital-3222 134 22 ctdl ctdl NOUN ital-3222 134 23 , , PUNCT ital-3222 134 24 and and CCONJ ital-3222 134 25 25 25 NUM ital-3222 134 26 percent percent NOUN ital-3222 134 27 using use VERB ital-3222 134 28 ctdl+ ctdl+ PROPN ital-3222 134 29 ; ; PUNCT ital-3222 134 30 the the DET ital-3222 134 31 numbers number NOUN ital-3222 134 32 for for ADP ital-3222 134 33 lzma lzma NOUN ital-3222 134 34 are be AUX ital-3222 134 35 respectively respectively ADV ital-3222 134 36 9 9 NUM ital-3222 134 37 percent percent NOUN ital-3222 134 38 for for ADP ital-3222 134 39 ctdl ctdl NOUN ital-3222 134 40 and and CCONJ ital-3222 134 41 17 17 NUM ital-3222 134 42 percent percent NOUN ital-3222 134 43 
for for ADP ital-3222 134 44 ctdl+ ctdl+ PROPN ital-3222 134 45 . . PUNCT ital-3222 135 1 in in ADP ital-3222 135 2 a a DET ital-3222 135 3 cross cross ADJ ital-3222 135 4 - - ADJ ital-3222 135 5 method method ADJ ital-3222 135 6 comparison comparison NOUN ital-3222 135 7 , , PUNCT ital-3222 135 8 ctdl+ ctdl+ ADP ital-3222 135 9 with with ADP ital-3222 135 10 lzma lzma NOUN ital-3222 135 11 beats beat NOUN ital-3222 135 12 gzip gzip NOUN ital-3222 135 13 by by ADP ital-3222 135 14 34 34 NUM ital-3222 135 15 percent percent NOUN ital-3222 135 16 , , PUNCT ital-3222 135 17 bzip2 bzip2 NOUN ital-3222 135 18 by by ADP ital-3222 135 19 7 7 NUM ital-3222 135 20 percent percent NOUN ital-3222 135 21 , , PUNCT ital-3222 135 22 and and CCONJ ital-3222 135 23 loses lose VERB ital-3222 135 24 5 5 NUM ital-3222 135 25 percent percent NOUN ital-3222 135 26 to to PART ital-3222 135 27 ppmvc ppmvc VERB ital-3222 135 28 . . PUNCT ital-3222 136 1 although although SCONJ ital-3222 136 2 there there PRON ital-3222 136 3 is be VERB ital-3222 136 4 no no DET ital-3222 136 5 mode mode NOUN ital-3222 136 6 designed design VERB ital-3222 136 7 especially especially ADV ital-3222 136 8 for for ADP ital-3222 136 9 doc doc PROPN ital-3222 136 10 documents document NOUN ital-3222 136 11 in in ADP ital-3222 136 12 ctdl ctdl NOUN ital-3222 136 13 ( ( PUNCT ital-3222 136 14 table table NOUN ital-3222 136 15 8) 8) NUM ital-3222 136 16 , , PUNCT ital-3222 136 17 the the DET ital-3222 136 18 basic basic ADJ ital-3222 136 19 txt txt ADJ ital-3222 136 20 mode mode NOUN ital-3222 136 21 was be AUX ital-3222 136 22 used use VERB ital-3222 136 23 , , PUNCT ital-3222 136 24 as as SCONJ ital-3222 136 25 it it PRON ital-3222 136 26 was be AUX ital-3222 136 27 found find VERB ital-3222 136 28 experimentally experimentally ADV ital-3222 136 29 to to PART ital-3222 136 30 be be AUX ital-3222 136 31 the the DET ital-3222 136 32 best good ADJ ital-3222 136 33 choice choice NOUN ital-3222 136 34 available 
available ADJ ital-3222 136 35 . . PUNCT ital-3222 137 1 the the DET ital-3222 137 2 results result NOUN ital-3222 137 3 show show VERB ital-3222 137 4 it it PRON ital-3222 137 5 managed manage VERB ital-3222 137 6 to to PART ital-3222 137 7 improve improve VERB ital-3222 137 8 deflate deflate NOUN ital-3222 137 9 - - PUNCT ital-3222 137 10 based base VERB ital-3222 137 11 compression compression NOUN ital-3222 137 12 by by ADP ital-3222 137 13 9 9 NUM ital-3222 137 14 percent percent NOUN ital-3222 137 15 using use VERB ital-3222 137 16 ctdl ctdl NOUN ital-3222 137 17 , , PUNCT ital-3222 137 18 and and CCONJ ital-3222 137 19 by by ADP ital-3222 137 20 21 21 NUM ital-3222 137 21 percent percent NOUN ital-3222 137 22 using use VERB ital-3222 137 23 ctdl+ ctdl+ PROPN ital-3222 137 24 , , PUNCT ital-3222 137 25 whereas whereas SCONJ ital-3222 137 26 lzma lzma PROPN ital-3222 137 27 - - PUNCT ital-3222 137 28 based base VERB ital-3222 137 29 compression compression NOUN ital-3222 137 30 was be AUX ital-3222 137 31 improved improve VERB ital-3222 137 32 respectively respectively ADV ital-3222 137 33 by by ADP ital-3222 137 34 4 4 NUM ital-3222 137 35 percent percent NOUN ital-3222 137 36 for for ADP ital-3222 137 37 ctdl ctdl NOUN ital-3222 137 38 and and CCONJ ital-3222 137 39 14 14 NUM ital-3222 137 40 percent percent NOUN ital-3222 137 41 for for ADP ital-3222 137 42 ctdl+ ctdl+ PROPN ital-3222 137 43 . . 
PUNCT ital-3222 138 1 combined combine VERB ital-3222 138 2 with with ADP ital-3222 138 3 lzma lzma PROPN ital-3222 138 4 , , PUNCT ital-3222 138 5 ctdl+ ctdl+ PROPN ital-3222 138 6 compresses compress VERB ital-3222 138 7 doc doc PROPN ital-3222 138 8 documents document NOUN ital-3222 138 9 30 30 NUM ital-3222 138 10 percent percent NOUN ital-3222 138 11 better well ADJ ital-3222 138 12 than than ADP ital-3222 138 13 gzip gzip PROPN ital-3222 138 14 , , PUNCT ital-3222 138 15 13 13 NUM ital-3222 138 16 percent percent NOUN ital-3222 138 17 better well ADJ ital-3222 138 18 than than ADP ital-3222 138 19 bzip2 bzip2 NOUN ital-3222 138 20 , , PUNCT ital-3222 138 21 and and CCONJ ital-3222 138 22 1 1 NUM ital-3222 138 23 percent percent NOUN ital-3222 138 24 better well ADJ ital-3222 138 25 than than ADP ital-3222 138 26 ppmvc ppmvc NOUN ital-3222 138 27 . . PUNCT ital-3222 139 1 in in ADP ital-3222 139 2 case case NOUN ital-3222 139 3 of of ADP ital-3222 139 4 ps ps PROPN ital-3222 139 5 documents document NOUN ital-3222 139 6 ( ( PUNCT ital-3222 139 7 table table NOUN ital-3222 139 8 9 9 NUM ital-3222 139 9 ) ) PUNCT ital-3222 139 10 , , PUNCT ital-3222 139 11 the the DET ital-3222 139 12 gzip gzip NOUN ital-3222 139 13 results result NOUN ital-3222 139 14 were be AUX ital-3222 139 15 improved improve VERB ital-3222 139 16 , , PUNCT ital-3222 139 17 on on ADP ital-3222 139 18 average average ADJ ital-3222 139 19 , , PUNCT ital-3222 139 20 by by ADP ital-3222 139 21 5 5 NUM ital-3222 139 22 percent percent NOUN ital-3222 139 23 using use VERB ital-3222 139 24 ctdl ctdl NOUN ital-3222 139 25 , , PUNCT ital-3222 139 26 and and CCONJ ital-3222 139 27 by by ADP ital-3222 139 28 8 8 NUM ital-3222 139 29 percent percent NOUN ital-3222 139 30 using use VERB ital-3222 139 31 ctdl+ ctdl+ PROPN ital-3222 139 32 ; ; PUNCT ital-3222 139 33 the the DET ital-3222 139 34 numbers number NOUN ital-3222 139 35 for for ADP ital-3222 139 36 lzma lzma PROPN ital-3222 139 37 improved 
improve VERB ital-3222 139 38 3 3 NUM ital-3222 139 39 percent percent NOUN ital-3222 139 40 for for ADP ital-3222 139 41 ctdl ctdl NOUN ital-3222 139 42 and and CCONJ ital-3222 139 43 5 5 NUM ital-3222 139 44 percent percent NOUN ital-3222 139 45 for for ADP ital-3222 139 46 ctdl+ ctdl+ PROPN ital-3222 139 47 . . PUNCT ital-3222 140 1 in in ADP ital-3222 140 2 a a DET ital-3222 140 3 cross cross ADJ ital-3222 140 4 - - ADJ ital-3222 140 5 method method ADJ ital-3222 140 6 comparison comparison NOUN ital-3222 140 7 , , PUNCT ital-3222 140 8 ctdl+ ctdl+ ADP ital-3222 140 9 with with ADP ital-3222 140 10 lzma lzma PROPN ital-3222 140 11 beats beat NOUN ital-3222 140 12 gzip gzip NOUN ital-3222 140 13 by by ADP ital-3222 140 14 8 8 NUM ital-3222 140 15 percent percent NOUN ital-3222 140 16 , , PUNCT ital-3222 140 17 losing lose VERB ital-3222 140 18 5 5 NUM ital-3222 140 19 percent percent NOUN ital-3222 140 20 to to ADP ital-3222 140 21 bzip2 bzip2 NOUN ital-3222 140 22 and and CCONJ ital-3222 140 23 7 7 NUM ital-3222 140 24 percent percent NOUN ital-3222 140 25 to to PART ital-3222 140 26 ppmvc ppmvc NOUN ital-3222 140 27 . . 
PUNCT ital-3222 141 1 finally finally ADV ital-3222 141 2 , , PUNCT ital-3222 141 3 ctdl ctdl PROPN ital-3222 141 4 improved improve VERB ital-3222 141 5 deflate deflate NOUN ital-3222 141 6 - - PUNCT ital-3222 141 7 based base VERB ital-3222 141 8 compression compression NOUN ital-3222 141 9 of of ADP ital-3222 141 10 pdf pdf NOUN ital-3222 141 11 documents document NOUN ital-3222 141 12 ( ( PUNCT ital-3222 141 13 table table NOUN ital-3222 141 14 10 10 NUM ital-3222 141 15 ) ) PUNCT ital-3222 141 16 by by ADP ital-3222 141 17 9 9 NUM ital-3222 141 18 percent percent NOUN ital-3222 141 19 using use VERB ital-3222 141 20 ctdl ctdl NOUN ital-3222 141 21 and and CCONJ ital-3222 141 22 10 10 NUM ital-3222 141 23 percent percent NOUN ital-3222 141 24 using use VERB ital-3222 141 25 ctdl+ ctdl+ PROPN ital-3222 141 26 ( ( PUNCT ital-3222 141 27 compared compare VERB ital-3222 141 28 to to ADP ital-3222 141 29 gzip gzip NOUN ital-3222 141 30 ; ; PUNCT ital-3222 141 31 the the DET ital-3222 141 32 numbers number NOUN ital-3222 141 33 are be AUX ital-3222 141 34 table table NOUN ital-3222 141 35 5 5 NUM ital-3222 141 36 . . 
NUM ital-3222 141 37 compression compression NOUN ital-3222 141 38 efficiency efficiency NOUN ital-3222 141 39 and and CCONJ ital-3222 141 40 times time NOUN ital-3222 141 41 for for ADP ital-3222 141 42 the the DET ital-3222 141 43 xml xml NOUN ital-3222 141 44 documents document NOUN ital-3222 141 45 deflate deflate VERB ital-3222 141 46 lzma lzma PROPN ital-3222 141 47 bzip2 bzip2 NOUN ital-3222 141 48 ppmvc ppmvc NOUN ital-3222 141 49 file file NOUN ital-3222 141 50 name name NOUN ital-3222 141 51 gzip gzip PROPN ital-3222 141 52 ctdl ctdl PROPN ital-3222 141 53 ctdl+ ctdl+ X ital-3222 141 54 7 7 NUM ital-3222 141 55 - - PUNCT ital-3222 141 56 zip zip NOUN ital-3222 141 57 ctdl ctdl NOUN ital-3222 141 58 ctdl+ ctdl+ X ital-3222 141 59 13601 13601 NUM ital-3222 141 60 - - PUNCT ital-3222 141 61 t t NOUN ital-3222 141 62 2.046 2.046 NUM ital-3222 141 63 1.551 1.551 NUM ital-3222 141 64 1.514 1.514 NUM ital-3222 141 65 1.585 1.585 NUM ital-3222 141 66 1.405 1.405 NUM ital-3222 141 67 1.339 1.339 NUM ital-3222 141 68 1.451 1.451 NUM ital-3222 141 69 1.242 1.242 NUM ital-3222 142 1 16514 16514 NUM ital-3222 142 2 - - PUNCT ital-3222 142 3 t t NOUN ital-3222 142 4 0.871 0.871 NUM ital-3222 142 5 0.698 0.698 NUM ital-3222 142 6 0.670 0.670 NUM ital-3222 142 7 0.703 0.703 NUM ital-3222 142 8 0.612 0.612 NUM ital-3222 142 9 0.590 0.590 NUM ital-3222 142 10 0.599 0.599 NUM ital-3222 142 11 0.552 0.552 NUM ital-3222 142 12 1noam10 1noam10 NUM ital-3222 142 13 t t NOUN ital-3222 142 14 2.383 2.383 NUM ital-3222 142 15 1.870 1.870 NUM ital-3222 142 16 1.736 1.736 NUM ital-3222 142 17 1.914 1.914 NUM ital-3222 142 18 1.711 1.711 NUM ital-3222 142 19 1.575 1.575 NUM ital-3222 142 20 1.724 1.724 NUM ital-3222 142 21 1.515 1.515 NUM ital-3222 142 22 2ws2610 2ws2610 NUM ital-3222 142 23 0.691 0.691 NUM ital-3222 142 24 0.539 0.539 NUM ital-3222 142 25 0.497 0.497 NUM ital-3222 142 26 0.561 0.561 NUM ital-3222 142 27 0.474 0.474 NUM ital-3222 142 28 0.440 0.440 NUM ital-3222 143 
1 0.461 0.461 NUM ital-3222 143 2 0.422 0.422 NUM ital-3222 143 3 alice30 alice30 NOUN ital-3222 143 4 1.477 1.477 NUM ital-3222 143 5 1.258 1.258 NUM ital-3222 143 6 1.140 1.140 NUM ital-3222 143 7 1.248 1.248 NUM ital-3222 143 8 1.131 1.131 NUM ital-3222 143 9 1.034 1.034 NUM ital-3222 143 10 1.116 1.116 NUM ital-3222 143 11 0.999 0.999 NUM ital-3222 143 12 cdscs10 cdscs10 NOUN ital-3222 143 13 t t NOUN ital-3222 143 14 2.106 2.106 NUM ital-3222 143 15 1.892 1.892 NUM ital-3222 143 16 1.576 1.576 NUM ital-3222 143 17 1.862 1.862 NUM ital-3222 143 18 1.741 1.741 NUM ital-3222 143 19 1.462 1.462 NUM ital-3222 143 20 1.721 1.721 NUM ital-3222 143 21 1.538 1.538 NUM ital-3222 143 22 grimm10 grimm10 NOUN ital-3222 143 23 t t NOUN ital-3222 143 24 1.878 1.878 NUM ital-3222 143 25 1.485 1.485 NUM ital-3222 143 26 1.422 1.422 NUM ital-3222 143 27 1.521 1.521 NUM ital-3222 143 28 1.337 1.337 NUM ital-3222 143 29 1.276 1.276 NUM ital-3222 143 30 1.337 1.337 NUM ital-3222 143 31 1.198 1.198 NUM ital-3222 143 32 pandp12 pandp12 NOUN ital-3222 143 33 t t NOUN ital-3222 143 34 1.875 1.875 NUM ital-3222 143 35 1.404 1.404 NUM ital-3222 143 36 1.349 1.349 NUM ital-3222 143 37 1.465 1.465 NUM ital-3222 143 38 1.263 1.263 NUM ital-3222 143 39 1.207 1.207 NUM ital-3222 143 40 1.252 1.252 NUM ital-3222 143 41 1.105 1.105 NUM ital-3222 143 42 average average ADJ ital-3222 143 43 1.666 1.666 NUM ital-3222 143 44 1.337 1.337 NUM ital-3222 143 45 1.238 1.238 NUM ital-3222 143 46 1.357 1.357 NUM ital-3222 143 47 1.209 1.209 NUM ital-3222 143 48 1.115 1.115 NUM ital-3222 143 49 1.208 1.208 NUM ital-3222 143 50 1.071 1.071 NUM ital-3222 143 51 comp comp NOUN ital-3222 143 52 . . 
PUNCT ital-3222 144 1 time time NOUN ital-3222 144 2 0.750 0.750 NUM ital-3222 144 3 1.844 1.844 NUM ital-3222 144 4 1.390 1.390 NUM ital-3222 144 5 10.79 10.79 NUM ital-3222 144 6 4.891 4.891 NUM ital-3222 144 7 5.828 5.828 NUM ital-3222 144 8 7.047 7.047 NUM ital-3222 144 9 3.688 3.688 NUM ital-3222 144 10 dec dec PROPN ital-3222 144 11 . . PROPN ital-3222 144 12 time time NOUN ital-3222 144 13 0.141 0.141 NUM ital-3222 144 14 0.672 0.672 NUM ital-3222 144 15 0.750 0.750 NUM ital-3222 144 16 0.421 0.421 NUM ital-3222 144 17 0.859 0.859 NUM ital-3222 144 18 0.953 0.953 NUM ital-3222 144 19 1.140 1.140 NUM ital-3222 144 20 3.907 3.907 NUM ital-3222 144 21 150 150 NUM ital-3222 144 22 information information NOUN ital-3222 144 23 technology technology NOUN ital-3222 144 24 and and CCONJ ital-3222 144 25 libraries library NOUN ital-3222 144 26 | | NOUN ital-3222 144 27 september september PROPN ital-3222 144 28 2009 2009 NUM ital-3222 144 29 much much ADV ital-3222 144 30 higher high ADJ ital-3222 144 31 if if SCONJ ital-3222 144 32 compared compare VERB ital-3222 144 33 to to ADP ital-3222 144 34 the the DET ital-3222 144 35 embedded embed VERB ital-3222 144 36 pdf pdf NOUN ital-3222 144 37 compression compression NOUN ital-3222 144 38 — — PUNCT ital-3222 144 39 see see VERB ital-3222 144 40 “ " PUNCT ital-3222 144 41 native native ADJ ital-3222 144 42 ” " PUNCT ital-3222 144 43 column column NOUN ital-3222 144 44 in in ADP ital-3222 144 45 table table NOUN ital-3222 144 46 10 10 NUM ital-3222 144 47 ) ) PUNCT ital-3222 144 48 ; ; PUNCT ital-3222 144 49 the the DET ital-3222 144 50 numbers number NOUN ital-3222 144 51 for for ADP ital-3222 144 52 lzma lzma NOUN ital-3222 144 53 are be AUX ital-3222 144 54 respectively respectively ADV ital-3222 144 55 7 7 NUM ital-3222 144 56 percent percent NOUN ital-3222 144 57 for for ADP ital-3222 144 58 ctdl ctdl NOUN ital-3222 144 59 and and CCONJ ital-3222 144 60 10 10 NUM ital-3222 144 61 percent percent NOUN ital-3222 144 
62 for for ADP ital-3222 144 63 ctdl+ ctdl+ PROPN ital-3222 144 64 . . PUNCT ital-3222 145 1 combined combine VERB ital-3222 145 2 with with ADP ital-3222 145 3 lzma lzma PROPN ital-3222 145 4 , , PUNCT ital-3222 145 5 ctdl+ ctdl+ PROPN ital-3222 145 6 compresses compress NOUN ital-3222 145 7 pdf pdf VERB ital-3222 145 8 documents document NOUN ital-3222 145 9 28 28 NUM ital-3222 145 10 percent percent NOUN ital-3222 145 11 better well ADJ ital-3222 145 12 than than ADP ital-3222 145 13 gzip gzip PROPN ital-3222 145 14 , , PUNCT ital-3222 145 15 4 4 NUM ital-3222 145 16 percent percent NOUN ital-3222 145 17 better well ADJ ital-3222 145 18 than than ADP ital-3222 145 19 bzip2 bzip2 NOUN ital-3222 145 20 , , PUNCT ital-3222 145 21 and and CCONJ ital-3222 145 22 5 5 NUM ital-3222 145 23 percent percent NOUN ital-3222 145 24 worse bad ADJ ital-3222 145 25 than than ADP ital-3222 145 26 ppmvc ppmvc NOUN ital-3222 145 27 . . PUNCT ital-3222 146 1 the the DET ital-3222 146 2 results result NOUN ital-3222 146 3 presented present VERB ital-3222 146 4 in in ADP ital-3222 146 5 tables table NOUN ital-3222 146 6 3–10 3–10 NUM ital-3222 146 7 show show NOUN ital-3222 146 8 that that SCONJ ital-3222 146 9 ctdl ctdl PROPN ital-3222 146 10 manages manage VERB ital-3222 146 11 to to PART ital-3222 146 12 improve improve VERB ital-3222 146 13 compression compression NOUN ital-3222 146 14 efficiency efficiency NOUN ital-3222 146 15 of of ADP ital-3222 146 16 the the DET ital-3222 146 17 general general ADJ ital-3222 146 18 - - PUNCT ital-3222 146 19 purpose purpose NOUN ital-3222 146 20 algorithms algorithms NOUN ital-3222 146 21 it it PRON ital-3222 146 22 is be AUX ital-3222 146 23 based base VERB ital-3222 146 24 on on ADP ital-3222 146 25 . . 
PUNCT ital-3222 147 1 the the DET ital-3222 147 2 scale scale NOUN ital-3222 147 3 of of ADP ital-3222 147 4 improvement improvement NOUN ital-3222 147 5 varies varie NOUN ital-3222 147 6 between between ADP ital-3222 147 7 document document NOUN ital-3222 147 8 types type NOUN ital-3222 147 9 , , PUNCT ital-3222 147 10 but but CCONJ ital-3222 147 11 for for ADP ital-3222 147 12 most most ADJ ital-3222 147 13 of of ADP ital-3222 147 14 them they PRON ital-3222 147 15 it it PRON ital-3222 147 16 is be AUX ital-3222 147 17 more more ADJ ital-3222 147 18 than than ADP ital-3222 147 19 20 20 NUM ital-3222 147 20 percent percent NOUN ital-3222 147 21 for for ADP ital-3222 147 22 ctdl+ ctdl+ X ital-3222 147 23 and and CCONJ ital-3222 147 24 10 10 NUM ital-3222 147 25 percent percent NOUN ital-3222 147 26 for for ADP ital-3222 147 27 ctdl ctdl NOUN ital-3222 147 28 . . PUNCT ital-3222 148 1 the the DET ital-3222 148 2 smallest small ADJ ital-3222 148 3 improvement improvement NOUN ital-3222 148 4 is be AUX ital-3222 148 5 achieved achieve VERB ital-3222 148 6 in in ADP ital-3222 148 7 case case NOUN ital-3222 148 8 of of ADP ital-3222 148 9 ps ps PROPN ital-3222 148 10 ( ( PUNCT ital-3222 148 11 about about ADV ital-3222 148 12 5 5 NUM ital-3222 148 13 percent percent NOUN ital-3222 148 14 ) ) PUNCT ital-3222 148 15 . . 
PUNCT ital-3222 149 1 figure figure VERB ital-3222 149 2 1 1 NUM ital-3222 149 3 shows show VERB ital-3222 149 4 the the DET ital-3222 149 5 same same ADJ ital-3222 149 6 results result NOUN ital-3222 149 7 in in ADP ital-3222 149 8 another another DET ital-3222 149 9 perspective perspective NOUN ital-3222 149 10 : : PUNCT ital-3222 149 11 the the DET ital-3222 149 12 bars bar NOUN ital-3222 149 13 show show VERB ital-3222 149 14 how how SCONJ ital-3222 149 15 much much ADV ital-3222 149 16 better well ADJ ital-3222 149 17 compression compression NOUN ital-3222 149 18 ratios ratio NOUN ital-3222 149 19 were be AUX ital-3222 149 20 obtained obtain VERB ital-3222 149 21 for for ADP ital-3222 149 22 the the DET ital-3222 149 23 same same ADJ ital-3222 149 24 documents document NOUN ital-3222 149 25 using use VERB ital-3222 149 26 different different ADJ ital-3222 149 27 compression compression NOUN ital-3222 149 28 schemes scheme NOUN ital-3222 149 29 compared compare VERB ital-3222 149 30 to to ADP ital-3222 149 31 gzip gzip NOUN ital-3222 149 32 with with ADP ital-3222 149 33 default default NOUN ital-3222 149 34 options option NOUN ital-3222 149 35 ( ( PUNCT ital-3222 149 36 0 0 NUM ital-3222 149 37 percent percent NOUN ital-3222 149 38 means mean VERB ital-3222 149 39 no no DET ital-3222 149 40 improvement improvement NOUN ital-3222 149 41 ) ) PUNCT ital-3222 149 42 . . 
PUNCT ital-3222 150 1 compared compare VERB ital-3222 150 2 to to ADP ital-3222 150 3 gzip gzip NOUN ital-3222 150 4 , , PUNCT ital-3222 150 5 ctdl ctdl PROPN ital-3222 150 6 offers offer VERB ital-3222 150 7 a a DET ital-3222 150 8 significantly significantly ADV ital-3222 150 9 better well ADJ ital-3222 150 10 compression compression NOUN ital-3222 150 11 ratio ratio NOUN ital-3222 150 12 at at ADP ital-3222 150 13 the the DET ital-3222 150 14 expense expense NOUN ital-3222 150 15 of of ADP ital-3222 150 16 longer long ADV ital-3222 150 17 processing process VERB ital-3222 150 18 time time NOUN ital-3222 150 19 . . PUNCT ital-3222 151 1 the the DET ital-3222 151 2 relative relative ADJ ital-3222 151 3 difference difference NOUN ital-3222 151 4 is be AUX ital-3222 151 5 especially especially ADV ital-3222 151 6 high high ADJ ital-3222 151 7 in in ADP ital-3222 151 8 case case NOUN ital-3222 151 9 of of ADP ital-3222 151 10 decompression decompression NOUN ital-3222 151 11 . . PUNCT ital-3222 152 1 however however ADV ital-3222 152 2 , , PUNCT ital-3222 152 3 in in ADP ital-3222 152 4 absolute absolute ADJ ital-3222 152 5 terms term NOUN ital-3222 152 6 , , PUNCT ital-3222 152 7 even even ADV ital-3222 152 8 in in ADP ital-3222 152 9 the the DET ital-3222 152 10 worst bad ADJ ital-3222 152 11 case case NOUN ital-3222 152 12 of of ADP ital-3222 152 13 pdf pdf NOUN ital-3222 152 14 , , PUNCT ital-3222 152 15 the the DET ital-3222 152 16 average average ADJ ital-3222 152 17 delay delay NOUN ital-3222 152 18 between between ADP ital-3222 152 19 ctdl+ ctdl+ PROPN ital-3222 152 20 and and CCONJ ital-3222 152 21 gzip gzip PROPN ital-3222 152 22 is be AUX ital-3222 152 23 below below ADP ital-3222 152 24 180 180 NUM ital-3222 152 25 ms ms NOUN ital-3222 152 26 for for ADP ital-3222 152 27 compression compression NOUN ital-3222 152 28 and and CCONJ ital-3222 152 29 90 90 NUM ital-3222 152 30 ms ms NOUN ital-3222 152 31 for for ADP ital-3222 152 32 decompression 
decompression NOUN ital-3222 152 33 per per ADP ital-3222 152 34 file file NOUN ital-3222 152 35 . . PUNCT ital-3222 153 1 taking take VERB ital-3222 153 2 into into ADP ital-3222 153 3 consideration consideration NOUN ital-3222 153 4 the the DET ital-3222 153 5 low low ADJ ital-3222 153 6 - - PUNCT ital-3222 153 7 end end NOUN ital-3222 153 8 specification specification NOUN ital-3222 153 9 of of ADP ital-3222 153 10 the the DET ital-3222 153 11 test test NOUN ital-3222 153 12 computer computer NOUN ital-3222 153 13 , , PUNCT ital-3222 153 14 these these DET ital-3222 153 15 results result NOUN ital-3222 153 16 table table NOUN ital-3222 153 17 6 6 NUM ital-3222 153 18 . . NOUN ital-3222 153 19 compression compression NOUN ital-3222 153 20 efficiency efficiency NOUN ital-3222 153 21 and and CCONJ ital-3222 153 22 times time NOUN ital-3222 153 23 for for SCONJ ital-3222 153 24 the the DET ital-3222 153 25 html html PROPN ital-3222 153 26 documents document NOUN ital-3222 153 27 deflate deflate VERB ital-3222 153 28 lzma lzma PROPN ital-3222 153 29 bzip2 bzip2 NOUN ital-3222 153 30 ppmvc ppmvc NOUN ital-3222 153 31 file file NOUN ital-3222 153 32 name name NOUN ital-3222 153 33 gzip gzip PROPN ital-3222 153 34 ctdl ctdl PROPN ital-3222 153 35 ctdl+ ctdl+ X ital-3222 154 1 7 7 NUM ital-3222 154 2 - - PUNCT ital-3222 154 3 zip zip NOUN ital-3222 154 4 ctdl ctdl NOUN ital-3222 154 5 ctdl+ ctdl+ X ital-3222 154 6 13601 13601 NUM ital-3222 154 7 - - PUNCT ital-3222 154 8 t t NOUN ital-3222 154 9 2.696 2.696 NUM ital-3222 154 10 2.054 2.054 NUM ital-3222 154 11 1.940 1.940 NUM ital-3222 154 12 2.121 2.121 NUM ital-3222 154 13 1.868 1.868 NUM ital-3222 154 14 1.751 1.751 NUM ital-3222 154 15 1.932 1.932 NUM ital-3222 154 16 1.670 1.670 NUM ital-3222 154 17 16514 16514 NUM ital-3222 154 18 - - PUNCT ital-3222 154 19 t t NOUN ital-3222 154 20 1.726 1.726 NUM ital-3222 154 21 1.405 1.405 NUM ital-3222 154 22 1.310 1.310 NUM ital-3222 154 23 1.436 1.436 NUM ital-3222 154 24 
1.258 1.258 NUM ital-3222 154 25 1.180 1.180 NUM ital-3222 154 26 1.257 1.257 NUM ital-3222 154 27 1.113 1.113 NUM ital-3222 154 28 1noam10 1noam10 NUM ital-3222 154 29 t t NOUN ital-3222 155 1 2.768 2.768 NUM ital-3222 155 2 2.159 2.159 NUM ital-3222 155 3 1.972 1.972 NUM ital-3222 155 4 2.244 2.244 NUM ital-3222 155 5 1.979 1.979 NUM ital-3222 155 6 1.815 1.815 NUM ital-3222 155 7 1.973 1.973 NUM ital-3222 155 8 1.785 1.785 NUM ital-3222 155 9 2ws2610 2ws2610 NUM ital-3222 155 10 2.084 2.084 NUM ital-3222 155 11 1.747 1.747 NUM ital-3222 155 12 1.504 1.504 NUM ital-3222 155 13 1.743 1.743 NUM ital-3222 155 14 1.525 1.525 NUM ital-3222 155 15 1.344 1.344 NUM ital-3222 155 16 1.499 1.499 NUM ital-3222 155 17 1.303 1.303 NUM ital-3222 155 18 alice30 alice30 NOUN ital-3222 155 19 2.451 2.451 NUM ital-3222 155 20 2.124 2.124 NUM ital-3222 155 21 1.829 1.829 NUM ital-3222 155 22 2.128 2.128 NUM ital-3222 155 23 1.929 1.929 NUM ital-3222 155 24 1.701 1.701 NUM ital-3222 155 25 1.888 1.888 NUM ital-3222 155 26 1.684 1.684 NUM ital-3222 155 27 cdscs10 cdscs10 ADJ ital-3222 155 28 t t NOUN ital-3222 155 29 2.880 2.880 NUM ital-3222 155 30 2.593 2.593 NUM ital-3222 155 31 2.084 2.084 NUM ital-3222 155 32 2.597 2.597 NUM ital-3222 155 33 2.410 2.410 NUM ital-3222 155 34 1.966 1.966 NUM ital-3222 155 35 2.348 2.348 NUM ital-3222 155 36 2.131 2.131 NUM ital-3222 155 37 grimm10 grimm10 NOUN ital-3222 155 38 t t NOUN ital-3222 155 39 2.603 2.603 NUM ital-3222 155 40 2.074 2.074 NUM ital-3222 155 41 1.916 1.916 NUM ital-3222 155 42 2.138 2.138 NUM ital-3222 155 43 1.883 1.883 NUM ital-3222 155 44 1.752 1.752 NUM ital-3222 155 45 1.889 1.889 NUM ital-3222 155 46 1.688 1.688 NUM ital-3222 155 47 pandp12 pandp12 NOUN ital-3222 155 48 t t NOUN ital-3222 155 49 2.640 2.640 NUM ital-3222 155 50 2.037 2.037 NUM ital-3222 155 51 1.891 1.891 NUM ital-3222 155 52 2.120 2.120 NUM ital-3222 155 53 1.826 1.826 NUM ital-3222 155 54 1.717 1.717 NUM ital-3222 155 55 1.777 1.777 NUM ital-3222 155 
56 1.596 1.596 NUM ital-3222 155 57 average average ADJ ital-3222 155 58 2.481 2.481 NUM ital-3222 155 59 2.024 2.024 NUM ital-3222 155 60 1.806 1.806 NUM ital-3222 155 61 2.066 2.066 NUM ital-3222 155 62 1.835 1.835 NUM ital-3222 155 63 1.653 1.653 NUM ital-3222 155 64 1.820 1.820 NUM ital-3222 155 65 1.621 1.621 NUM ital-3222 155 66 comp comp NOUN ital-3222 155 67 . . PUNCT ital-3222 156 1 time time NOUN ital-3222 156 2 0.750 0.750 NUM ital-3222 156 3 1.438 1.438 NUM ital-3222 156 4 1.078 1.078 NUM ital-3222 156 5 8.203 8.203 NUM ital-3222 156 6 3.421 3.421 NUM ital-3222 156 7 3.328 3.328 NUM ital-3222 156 8 2.672 2.672 NUM ital-3222 156 9 3.500 3.500 NUM ital-3222 156 10 dec dec PROPN ital-3222 156 11 . . PROPN ital-3222 156 12 time time PROPN ital-3222 156 13 0.140 0.140 NUM ital-3222 156 14 0.515 0.515 NUM ital-3222 156 15 0.594 0.594 NUM ital-3222 156 16 0.359 0.359 NUM ital-3222 156 17 0.688 0.688 NUM ital-3222 156 18 0.750 0.750 NUM ital-3222 156 19 0.812 0.812 NUM ital-3222 156 20 3.672 3.672 NUM ital-3222 156 21 table table NOUN ital-3222 156 22 7 7 NUM ital-3222 156 23 . . 
PROPN ital-3222 156 24 compression compression NOUN ital-3222 156 25 efficiency efficiency NOUN ital-3222 156 26 and and CCONJ ital-3222 156 27 times time NOUN ital-3222 156 28 for for ADP ital-3222 156 29 the the DET ital-3222 156 30 rtf rtf PROPN ital-3222 156 31 documents document NOUN ital-3222 156 32 deflate deflate VERB ital-3222 156 33 lzma lzma PROPN ital-3222 156 34 bzip2 bzip2 NOUN ital-3222 156 35 ppmvc ppmvc NOUN ital-3222 156 36 file file NOUN ital-3222 156 37 name name NOUN ital-3222 156 38 gzip gzip PROPN ital-3222 156 39 ctdl ctdl PROPN ital-3222 156 40 ctdl+ ctdl+ X ital-3222 157 1 7 7 NUM ital-3222 157 2 - - PUNCT ital-3222 157 3 zip zip NOUN ital-3222 157 4 ctdl ctdl NOUN ital-3222 157 5 ctdl+ ctdl+ X ital-3222 157 6 13601 13601 NUM ital-3222 157 7 - - PUNCT ital-3222 157 8 t t NOUN ital-3222 157 9 1.882 1.882 NUM ital-3222 157 10 1.431 1.431 NUM ital-3222 157 11 1.372 1.372 NUM ital-3222 157 12 1.428 1.428 NUM ital-3222 157 13 1.267 1.267 NUM ital-3222 157 14 1.200 1.200 NUM ital-3222 157 15 1.300 1.300 NUM ital-3222 157 16 1.120 1.120 NUM ital-3222 157 17 16514 16514 NUM ital-3222 157 18 - - PUNCT ital-3222 157 19 t t NOUN ital-3222 157 20 0.834 0.834 NUM ital-3222 157 21 0.701 0.701 NUM ital-3222 158 1 0.696 0.696 NUM ital-3222 158 2 0.662 0.662 NUM ital-3222 158 3 0.601 0.601 NUM ital-3222 158 4 0.591 0.591 NUM ital-3222 158 5 0.568 0.568 NUM ital-3222 158 6 0.529 0.529 NUM ital-3222 159 1 1noam10 1noam10 NUM ital-3222 159 2 t t NOUN ital-3222 159 3 2.244 2.244 NUM ital-3222 159 4 1.774 1.774 NUM ital-3222 159 5 1.637 1.637 NUM ital-3222 159 6 1.765 1.765 NUM ital-3222 159 7 1.594 1.594 NUM ital-3222 159 8 1.462 1.462 NUM ital-3222 159 9 1.601 1.601 NUM ital-3222 159 10 1.404 1.404 NUM ital-3222 159 11 2ws2610 2ws2610 NUM ital-3222 159 12 0.784 0.784 NUM ital-3222 159 13 0.630 0.630 NUM ital-3222 159 14 0.581 0.581 NUM ital-3222 159 15 0.629 0.629 NUM ital-3222 159 16 0.545 0.545 NUM ital-3222 159 17 0.500 0.500 NUM ital-3222 159 18 0.520 
0.520 NUM ital-3222 159 19 0.485 0.485 NUM ital-3222 159 20 alice30 alice30 NOUN ital-3222 159 21 1.382 1.382 NUM ital-3222 159 22 1.196 1.196 NUM ital-3222 159 23 1.065 1.065 NUM ital-3222 159 24 1.134 1.134 NUM ital-3222 159 25 1.046 1.046 NUM ital-3222 159 26 0.948 0.948 NUM ital-3222 159 27 0.995 0.995 NUM ital-3222 159 28 0.922 0.922 NUM ital-3222 159 29 cdscs10 cdscs10 ADJ ital-3222 159 30 t t NOUN ital-3222 160 1 2.059 2.059 NUM ital-3222 160 2 1.882 1.882 NUM ital-3222 160 3 1.558 1.558 NUM ital-3222 160 4 1.784 1.784 NUM ital-3222 160 5 1.704 1.704 NUM ital-3222 160 6 1.432 1.432 NUM ital-3222 160 7 1.645 1.645 NUM ital-3222 160 8 1.488 1.488 NUM ital-3222 160 9 grimm10 grimm10 NOUN ital-3222 160 10 t t NOUN ital-3222 160 11 1.618 1.618 NUM ital-3222 160 12 1.301 1.301 NUM ital-3222 160 13 1.227 1.227 NUM ital-3222 160 14 1.285 1.285 NUM ital-3222 160 15 1.150 1.150 NUM ital-3222 160 16 1.082 1.082 NUM ital-3222 160 17 1.149 1.149 NUM ital-3222 160 18 1.010 1.010 NUM ital-3222 160 19 pandp12 pandp12 NOUN ital-3222 160 20 t t NOUN ital-3222 160 21 1.742 1.742 NUM ital-3222 160 22 1.340 1.340 NUM ital-3222 160 23 1.264 1.264 NUM ital-3222 160 24 1.336 1.336 NUM ital-3222 160 25 1.169 1.169 NUM ital-3222 160 26 1.115 1.115 NUM ital-3222 160 27 1.142 1.142 NUM ital-3222 160 28 1.012 1.012 NUM ital-3222 160 29 average average NOUN ital-3222 160 30 1.568 1.568 NUM ital-3222 160 31 1.282 1.282 NUM ital-3222 160 32 1.175 1.175 NUM ital-3222 160 33 1.253 1.253 NUM ital-3222 160 34 1.135 1.135 NUM ital-3222 160 35 1.041 1.041 NUM ital-3222 160 36 1.115 1.115 NUM ital-3222 160 37 0.996 0.996 NUM ital-3222 160 38 comp comp NOUN ital-3222 160 39 . . 
PUNCT ital-3222 161 1 time time NOUN ital-3222 161 2 0.766 0.766 NUM ital-3222 161 3 2.047 2.047 NUM ital-3222 161 4 1.500 1.500 NUM ital-3222 161 5 12.62 12.62 NUM ital-3222 161 6 6.500 6.500 NUM ital-3222 161 7 7.562 7.562 NUM ital-3222 161 8 8.032 8.032 NUM ital-3222 161 9 3.922 3.922 NUM ital-3222 161 10 dec dec PROPN ital-3222 161 11 . . PROPN ital-3222 161 12 time time NOUN ital-3222 161 13 0.156 0.156 NUM ital-3222 161 14 0.688 0.688 NUM ital-3222 161 15 0.766 0.766 NUM ital-3222 161 16 0.469 0.469 NUM ital-3222 161 17 0.875 0.875 NUM ital-3222 161 18 0.953 0.953 NUM ital-3222 161 19 1.312 1.312 NUM ital-3222 161 20 4.157 4.157 NUM ital-3222 161 21 the the DET ital-3222 161 22 efficient efficient ADJ ital-3222 161 23 storage storage NOUN ital-3222 161 24 of of ADP ital-3222 161 25 text text NOUN ital-3222 161 26 documents document NOUN ital-3222 161 27 in in ADP ital-3222 161 28 digital digital ADJ ital-3222 161 29 libraries library NOUN ital-3222 161 30 | | NOUN ital-3222 161 31 skibiński skibiński NOUN ital-3222 161 32 and and CCONJ ital-3222 161 33 swacha swacha PROPN ital-3222 161 34 151 151 NUM ital-3222 161 35 certainly certainly ADV ital-3222 161 36 seem seem VERB ital-3222 161 37 good good ADJ ital-3222 161 38 enough enough ADV ital-3222 161 39 for for ADP ital-3222 161 40 practical practical ADJ ital-3222 161 41 applications application NOUN ital-3222 161 42 . . 
PUNCT ital-3222 162 1 compared compare VERB ital-3222 162 2 to to ADP ital-3222 162 3 lzma lzma PROPN ital-3222 162 4 , , PUNCT ital-3222 162 5 ctdl ctdl PROPN ital-3222 162 6 offers offer VERB ital-3222 162 7 better well ADJ ital-3222 162 8 compression compression NOUN ital-3222 162 9 and and CCONJ ital-3222 162 10 a a DET ital-3222 162 11 shorter short ADJ ital-3222 162 12 compression compression NOUN ital-3222 162 13 time time NOUN ital-3222 162 14 at at ADP ital-3222 162 15 the the DET ital-3222 162 16 expense expense NOUN ital-3222 162 17 of of ADP ital-3222 162 18 longer long ADJ ital-3222 162 19 decompression decompression NOUN ital-3222 162 20 time time NOUN ital-3222 162 21 . . PUNCT ital-3222 163 1 notice notice VERB ital-3222 163 2 that that SCONJ ital-3222 163 3 the the DET ital-3222 163 4 absolute absolute ADJ ital-3222 163 5 gain gain NOUN ital-3222 163 6 in in ADP ital-3222 163 7 compression compression NOUN ital-3222 163 8 time time NOUN ital-3222 163 9 is be AUX ital-3222 163 10 several several ADJ ital-3222 163 11 times time NOUN ital-3222 163 12 the the DET ital-3222 163 13 loss loss NOUN ital-3222 163 14 in in ADP ital-3222 163 15 decompression decompression NOUN ital-3222 163 16 time time NOUN ital-3222 163 17 , , PUNCT ital-3222 163 18 and and CCONJ ital-3222 163 19 the the DET ital-3222 163 20 decompression decompression NOUN ital-3222 163 21 time time NOUN ital-3222 163 22 remains remain VERB ital-3222 163 23 short short ADJ ital-3222 163 24 , , PUNCT ital-3222 163 25 noticeably noticeably ADV ital-3222 163 26 shorter short ADJ ital-3222 163 27 than than ADP ital-3222 163 28 bzip2 bzip2 NOUN ital-3222 163 29 ’s ’s NUM ital-3222 163 30 and and CCONJ ital-3222 163 31 several several ADJ ital-3222 163 32 times time NOUN ital-3222 163 33 shorter short ADJ ital-3222 163 34 than than ADP ital-3222 163 35 ppmvc ppmvc NOUN ital-3222 163 36 ’s ’s NUM ital-3222 163 37 . . 
PUNCT ital-3222 164 1 ctdl+ ctdl+ VERB ital-3222 164 2 beats beat NOUN ital-3222 164 3 bzip2 bzip2 NOUN ital-3222 164 4 ( ( PUNCT ital-3222 164 5 with with ADP ital-3222 164 6 the the DET ital-3222 164 7 sole sole ADJ ital-3222 164 8 exception exception NOUN ital-3222 164 9 of of ADP ital-3222 164 10 ps ps PROPN ital-3222 164 11 documents document NOUN ital-3222 164 12 ) ) PUNCT ital-3222 164 13 in in ADP ital-3222 164 14 terms term NOUN ital-3222 164 15 of of ADP ital-3222 164 16 compression compression NOUN ital-3222 164 17 ratio ratio NOUN ital-3222 164 18 and and CCONJ ital-3222 164 19 achieves achieve VERB ital-3222 164 20 results result NOUN ital-3222 164 21 that that PRON ital-3222 164 22 are be AUX ital-3222 164 23 mostly mostly ADV ital-3222 164 24 very very ADV ital-3222 164 25 close close ADJ ital-3222 164 26 to to ADP ital-3222 164 27 the the DET ital-3222 164 28 resourcehungry resourcehungry ADJ ital-3222 164 29 ppmvc ppmvc NOUN ital-3222 164 30 . . PUNCT ital-3222 165 1 n n X ital-3222 165 2 conclusions conclusion NOUN ital-3222 165 3 in in ADP ital-3222 165 4 this this DET ital-3222 165 5 paper paper NOUN ital-3222 165 6 we we PRON ital-3222 165 7 addressed address VERB ital-3222 165 8 the the DET ital-3222 165 9 problem problem NOUN ital-3222 165 10 of of ADP ital-3222 165 11 compressing compress VERB ital-3222 165 12 text text NOUN ital-3222 165 13 documents document NOUN ital-3222 165 14 . . 
PUNCT ital-3222 166 1 although although SCONJ ital-3222 166 2 individual individual ADJ ital-3222 166 3 text text NOUN ital-3222 166 4 documents document NOUN ital-3222 166 5 rarely rarely ADV ital-3222 166 6 exceed exceed VERB ital-3222 166 7 several several ADJ ital-3222 166 8 megabytes megabyte NOUN ital-3222 166 9 in in ADP ital-3222 166 10 size size NOUN ital-3222 166 11 , , PUNCT ital-3222 166 12 their their PRON ital-3222 166 13 entire entire ADJ ital-3222 166 14 collections collection NOUN ital-3222 166 15 can can AUX ital-3222 166 16 have have VERB ital-3222 166 17 very very ADV ital-3222 166 18 large large ADJ ital-3222 166 19 storage storage NOUN ital-3222 166 20 space space NOUN ital-3222 166 21 requirements requirement NOUN ital-3222 166 22 . . PUNCT ital-3222 167 1 although although SCONJ ital-3222 167 2 text text NOUN ital-3222 167 3 documents document NOUN ital-3222 167 4 are be AUX ital-3222 167 5 often often ADV ital-3222 167 6 compressed compress VERB ital-3222 167 7 with with ADP ital-3222 167 8 general general ADJ ital-3222 167 9 - - PUNCT ital-3222 167 10 purpose purpose NOUN ital-3222 167 11 methods method NOUN ital-3222 167 12 such such ADJ ital-3222 167 13 as as ADP ital-3222 167 14 deflate deflate ADJ ital-3222 167 15 , , PUNCT ital-3222 167 16 much much ADV ital-3222 167 17 better well ADJ ital-3222 167 18 compression compression NOUN ital-3222 167 19 can can AUX ital-3222 167 20 be be AUX ital-3222 167 21 obtained obtain VERB ital-3222 167 22 with with ADP ital-3222 167 23 a a DET ital-3222 167 24 scheme scheme NOUN ital-3222 167 25 specialized specialize VERB ital-3222 167 26 for for ADP ital-3222 167 27 text text NOUN ital-3222 167 28 , , PUNCT ital-3222 167 29 and and CCONJ ital-3222 167 30 even even ADV ital-3222 167 31 better well ADJ ital-3222 167 32 if if SCONJ ital-3222 167 33 the the DET ital-3222 167 34 scheme scheme NOUN ital-3222 167 35 is be AUX ital-3222 167 36 additionally additionally ADV ital-3222 167 37 specialized 
specialize VERB ital-3222 167 38 for for ADP ital-3222 167 39 individual individual ADJ ital-3222 167 40 document document NOUN ital-3222 167 41 formats format NOUN ital-3222 167 42 . . PUNCT ital-3222 168 1 we we PRON ital-3222 168 2 have have AUX ital-3222 168 3 developed develop VERB ital-3222 168 4 such such DET ital-3222 168 5 a a DET ital-3222 168 6 scheme scheme NOUN ital-3222 168 7 ( ( PUNCT ital-3222 168 8 ctdl ctdl NOUN ital-3222 168 9 ) ) PUNCT ital-3222 168 10 , , PUNCT ital-3222 168 11 beginning begin VERB ital-3222 168 12 with with ADP ital-3222 168 13 a a DET ital-3222 168 14 text text NOUN ital-3222 168 15 transform transform NOUN ital-3222 168 16 designed design VERB ital-3222 168 17 earlier early ADV ital-3222 168 18 for for ADP ital-3222 168 19 xml xml NOUN ital-3222 168 20 documents document NOUN ital-3222 168 21 and and CCONJ ital-3222 168 22 table table NOUN ital-3222 168 23 8 8 NUM ital-3222 168 24 . . NOUN ital-3222 168 25 compression compression NOUN ital-3222 168 26 efficiency efficiency NOUN ital-3222 168 27 and and CCONJ ital-3222 168 28 times time NOUN ital-3222 168 29 for for ADP ital-3222 168 30 the the DET ital-3222 168 31 doc doc PROPN ital-3222 168 32 documents document NOUN ital-3222 168 33 deflate deflate VERB ital-3222 168 34 lzma lzma PROPN ital-3222 168 35 bzip2 bzip2 NOUN ital-3222 168 36 ppmvc ppmvc NOUN ital-3222 168 37 file file NOUN ital-3222 168 38 name name NOUN ital-3222 168 39 gzip gzip PROPN ital-3222 168 40 ctdl ctdl PROPN ital-3222 168 41 ctdl+ ctdl+ X ital-3222 169 1 7 7 NUM ital-3222 169 2 - - PUNCT ital-3222 169 3 zip zip NOUN ital-3222 169 4 ctdl ctdl NOUN ital-3222 169 5 ctdl+ ctdl+ X ital-3222 169 6 13601 13601 NUM ital-3222 169 7 - - PUNCT ital-3222 169 8 t t NOUN ital-3222 169 9 2.798 2.798 NUM ital-3222 169 10 2.183 2.183 NUM ital-3222 169 11 2.062 2.062 NUM ital-3222 169 12 2.181 2.181 NUM ital-3222 169 13 1.976 1.976 NUM ital-3222 169 14 1.854 1.854 NUM ital-3222 169 15 2.115 2.115 NUM ital-3222 169 16 
1.818 1.818 NUM ital-3222 169 17 16514 16514 NUM ital-3222 169 18 - - PUNCT ital-3222 169 19 t t NOUN ital-3222 169 20 2.226 2.226 NUM ital-3222 169 21 2.213 2.213 NUM ital-3222 169 22 2.073 2.073 NUM ital-3222 169 23 1.712 1.712 NUM ital-3222 169 24 1.712 1.712 NUM ital-3222 169 25 1.652 1.652 NUM ital-3222 169 26 1.919 1.919 NUM ital-3222 169 27 1.686 1.686 NUM ital-3222 169 28 1noam10 1noam10 NUM ital-3222 169 29 t t NOUN ital-3222 169 30 2.851 2.851 NUM ital-3222 169 31 2.250 2.250 NUM ital-3222 169 32 2.025 2.025 NUM ital-3222 169 33 2.289 2.289 NUM ital-3222 169 34 2.057 2.057 NUM ital-3222 169 35 1.869 1.869 NUM ital-3222 169 36 2.113 2.113 NUM ital-3222 169 37 1.870 1.870 NUM ital-3222 169 38 2ws2610 2ws2610 NUM ital-3222 170 1 2.497 2.497 NUM ital-3222 170 2 2.499 2.499 NUM ital-3222 170 3 2.210 2.210 NUM ital-3222 170 4 2.095 2.095 NUM ital-3222 170 5 2.095 2.095 NUM ital-3222 170 6 1.890 1.890 NUM ital-3222 170 7 2.251 2.251 NUM ital-3222 170 8 1.999 1.999 NUM ital-3222 170 9 alice30 alice30 NOUN ital-3222 171 1 2.744 2.744 NUM ital-3222 171 2 2.714 2.714 NUM ital-3222 171 3 2.270 2.270 NUM ital-3222 171 4 2.345 2.345 NUM ital-3222 171 5 2.345 2.345 NUM ital-3222 171 6 2.038 2.038 NUM ital-3222 171 7 2.348 2.348 NUM ital-3222 171 8 2.058 2.058 NUM ital-3222 171 9 cdscs10 cdscs10 ADJ ital-3222 171 10 t t NOUN ital-3222 171 11 2.916 2.916 NUM ital-3222 171 12 2.891 2.891 NUM ital-3222 171 13 2.231 2.231 NUM ital-3222 171 14 2.559 2.559 NUM ital-3222 171 15 2.560 2.560 NUM ital-3222 171 16 2.062 2.062 NUM ital-3222 171 17 2.475 2.475 NUM ital-3222 171 18 2.196 2.196 NUM ital-3222 171 19 grimm10 grimm10 NOUN ital-3222 171 20 t t NOUN ital-3222 171 21 2.691 2.691 NUM ital-3222 171 22 2.677 2.677 NUM ital-3222 171 23 2.059 2.059 NUM ital-3222 171 24 2.179 2.179 NUM ital-3222 171 25 2.179 2.179 NUM ital-3222 171 26 1.856 1.856 NUM ital-3222 171 27 2.075 2.075 NUM ital-3222 171 28 1.833 1.833 NUM ital-3222 171 29 pandp12 pandp12 NOUN ital-3222 171 30 t t NOUN 
ital-3222 171 31 2.761 2.761 NUM ital-3222 171 32 2.171 2.171 NUM ital-3222 171 33 2.050 2.050 NUM ital-3222 171 34 2.189 2.189 NUM ital-3222 171 35 1.955 1.955 NUM ital-3222 171 36 1.843 1.843 NUM ital-3222 171 37 1.983 1.983 NUM ital-3222 171 38 1.770 1.770 NUM ital-3222 171 39 average average ADJ ital-3222 171 40 2.686 2.686 NUM ital-3222 171 41 2.450 2.450 NUM ital-3222 171 42 2.123 2.123 NUM ital-3222 171 43 2.194 2.194 NUM ital-3222 171 44 2.110 2.110 NUM ital-3222 171 45 1.883 1.883 NUM ital-3222 171 46 2.160 2.160 NUM ital-3222 171 47 1.904 1.904 NUM ital-3222 171 48 comp comp NOUN ital-3222 171 49 . . PUNCT ital-3222 172 1 time time NOUN ital-3222 172 2 0.718 0.718 NUM ital-3222 172 3 1.312 1.312 NUM ital-3222 172 4 1.031 1.031 NUM ital-3222 172 5 7.078 7.078 NUM ital-3222 172 6 4.063 4.063 NUM ital-3222 172 7 3.001 3.001 NUM ital-3222 172 8 2.250 2.250 NUM ital-3222 172 9 3.421 3.421 NUM ital-3222 172 10 dec dec PROPN ital-3222 172 11 . . PROPN ital-3222 172 12 time time PROPN ital-3222 172 13 0.125 0.125 NUM ital-3222 172 14 0.375 0.375 NUM ital-3222 172 15 0.547 0.547 NUM ital-3222 172 16 0.344 0.344 NUM ital-3222 172 17 0.547 0.547 NUM ital-3222 172 18 0.718 0.718 NUM ital-3222 172 19 0.735 0.735 NUM ital-3222 172 20 3.625 3.625 NUM ital-3222 172 21 table table NOUN ital-3222 172 22 9 9 NUM ital-3222 172 23 . . 
NOUN ital-3222 172 24 compression compression NOUN ital-3222 172 25 efficiency efficiency NOUN ital-3222 172 26 and and CCONJ ital-3222 172 27 times time NOUN ital-3222 172 28 for for SCONJ ital-3222 172 29 the the DET ital-3222 172 30 ps ps PROPN ital-3222 172 31 documents document NOUN ital-3222 172 32 deflate deflate VERB ital-3222 172 33 lzma lzma PROPN ital-3222 172 34 bzip2 bzip2 NOUN ital-3222 172 35 ppmvc ppmvc NOUN ital-3222 172 36 file file NOUN ital-3222 172 37 name name NOUN ital-3222 172 38 gzip gzip PROPN ital-3222 172 39 ctdl ctdl PROPN ital-3222 172 40 ctdl+ ctdl+ X ital-3222 173 1 7 7 NUM ital-3222 173 2 - - PUNCT ital-3222 173 3 zip zip NOUN ital-3222 173 4 ctdl ctdl NOUN ital-3222 173 5 ctdl+ ctdl+ X ital-3222 173 6 13601 13601 NUM ital-3222 173 7 - - PUNCT ital-3222 173 8 t t NOUN ital-3222 173 9 2.847 2.847 NUM ital-3222 173 10 2.634 2.634 NUM ital-3222 173 11 2.589 2.589 NUM ital-3222 173 12 2.213 2.213 NUM ital-3222 173 13 2.105 2.105 NUM ital-3222 173 14 2.074 2.074 NUM ital-3222 173 15 2.011 2.011 NUM ital-3222 173 16 1.778 1.778 NUM ital-3222 173 17 16514 16514 NUM ital-3222 173 18 - - PUNCT ital-3222 173 19 t t NOUN ital-3222 173 20 3.226 3.226 NUM ital-3222 173 21 3.129 3.129 NUM ital-3222 173 22 3.039 3.039 NUM ital-3222 173 23 2.730 2.730 NUM ital-3222 173 24 2.707 2.707 NUM ital-3222 173 25 2.699 2.699 NUM ital-3222 173 26 2.613 2.613 NUM ital-3222 173 27 2.505 2.505 NUM ital-3222 173 28 1noam10 1noam10 NUM ital-3222 173 29 t t NOUN ital-3222 173 30 2.718 2.718 NUM ital-3222 173 31 2.551 2.551 NUM ital-3222 173 32 2.490 2.490 NUM ital-3222 173 33 2.147 2.147 NUM ital-3222 173 34 2.060 2.060 NUM ital-3222 173 35 2.015 2.015 NUM ital-3222 173 36 1.892 1.892 NUM ital-3222 173 37 1.694 1.694 NUM ital-3222 173 38 2ws2610 2ws2610 NUM ital-3222 173 39 3.064 3.064 NUM ital-3222 173 40 2.922 2.922 NUM ital-3222 173 41 2.795 2.795 NUM ital-3222 173 42 2.600 2.600 NUM ital-3222 173 43 2.521 2.521 NUM ital-3222 173 44 2.450 2.450 NUM ital-3222 
173 45 2.336 2.336 NUM ital-3222 173 46 2.186 2.186 NUM ital-3222 173 47 alice30 alice30 NOUN ital-3222 173 48 3.224 3.224 NUM ital-3222 173 49 3.154 3.154 NUM ital-3222 173 50 3.026 3.026 NUM ital-3222 173 51 2.750 2.750 NUM ital-3222 173 52 2.745 2.745 NUM ital-3222 173 53 2.691 2.691 NUM ital-3222 173 54 2.553 2.553 NUM ital-3222 173 55 2.400 2.400 NUM ital-3222 173 56 cdscs10 cdscs10 ADJ ital-3222 173 57 t t NOUN ital-3222 173 58 3.110 3.110 NUM ital-3222 173 59 3.029 3.029 NUM ital-3222 173 60 2.890 2.890 NUM ital-3222 173 61 2.657 2.657 NUM ital-3222 173 62 2.683 2.683 NUM ital-3222 173 63 2.579 2.579 NUM ital-3222 173 64 2.447 2.447 NUM ital-3222 173 65 2.276 2.276 NUM ital-3222 173 66 grimm10 grimm10 NOUN ital-3222 173 67 t t NOUN ital-3222 173 68 2.833 2.833 NUM ital-3222 173 69 2.664 2.664 NUM ital-3222 173 70 2.597 2.597 NUM ital-3222 173 71 2.288 2.288 NUM ital-3222 173 72 2.200 2.200 NUM ital-3222 173 73 2.162 2.162 NUM ital-3222 173 74 2.074 2.074 NUM ital-3222 173 75 1.863 1.863 NUM ital-3222 173 76 pandp12 pandp12 PUNCT ital-3222 173 77 t t NOUN ital-3222 173 78 2.814 2.814 NUM ital-3222 173 79 2.533 2.533 NUM ital-3222 173 80 2.468 2.468 NUM ital-3222 173 81 2.193 2.193 NUM ital-3222 173 82 2.049 2.049 NUM ital-3222 173 83 1.998 1.998 NUM ital-3222 173 84 1.858 1.858 NUM ital-3222 173 85 1.644 1.644 NUM ital-3222 173 86 average average ADJ ital-3222 173 87 2.980 2.980 NUM ital-3222 173 88 2.827 2.827 NUM ital-3222 173 89 2.737 2.737 NUM ital-3222 173 90 2.447 2.447 NUM ital-3222 173 91 2.384 2.384 NUM ital-3222 173 92 2.334 2.334 NUM ital-3222 173 93 2.223 2.223 NUM ital-3222 173 94 2.043 2.043 NUM ital-3222 173 95 comp comp NOUN ital-3222 173 96 . . 
PUNCT ital-3222 174 1 time time NOUN ital-3222 174 2 1.328 1.328 NUM ital-3222 174 3 3.015 3.015 NUM ital-3222 174 4 2.500 2.500 NUM ital-3222 174 5 14.23 14.23 NUM ital-3222 174 6 10.96 10.96 NUM ital-3222 174 7 11.09 11.09 NUM ital-3222 174 8 4.171 4.171 NUM ital-3222 174 9 5.765 5.765 NUM ital-3222 174 10 dec dec PROPN ital-3222 174 11 . . PROPN ital-3222 174 12 time time PROPN ital-3222 174 13 0.203 0.203 NUM ital-3222 174 14 0.688 0.688 NUM ital-3222 174 15 0.781 0.781 NUM ital-3222 174 16 0.609 0.609 NUM ital-3222 174 17 1.063 1.063 NUM ital-3222 174 18 1.125 1.125 NUM ital-3222 174 19 1.360 1.360 NUM ital-3222 174 20 6.063 6.063 NUM ital-3222 174 21 152 152 NUM ital-3222 174 22 information information NOUN ital-3222 174 23 technology technology NOUN ital-3222 174 24 and and CCONJ ital-3222 174 25 libraries library NOUN ital-3222 174 26 | | NOUN ital-3222 174 27 september september PROPN ital-3222 174 28 2009 2009 NUM ital-3222 174 29 modifying modify VERB ital-3222 174 30 it it PRON ital-3222 174 31 for for ADP ital-3222 174 32 the the DET ital-3222 174 33 requirements requirement NOUN ital-3222 174 34 of of ADP ital-3222 174 35 each each PRON ital-3222 174 36 of of ADP ital-3222 174 37 the the DET ital-3222 174 38 investigated investigate VERB ital-3222 174 39 document document NOUN ital-3222 174 40 formats format NOUN ital-3222 174 41 . . 
PUNCT ital-3222 175 1 it it PRON ital-3222 175 2 has have VERB ital-3222 175 3 two two NUM ital-3222 175 4 operation operation NOUN ital-3222 175 5 modes mode NOUN ital-3222 175 6 : : PUNCT ital-3222 175 7 basic basic ADJ ital-3222 175 8 ctdl ctdl NOUN ital-3222 175 9 and and CCONJ ital-3222 175 10 ctdl+ ctdl+ X ital-3222 175 11 ( ( PUNCT ital-3222 175 12 the the DET ital-3222 175 13 latter latter ADJ ital-3222 175 14 uses use VERB ital-3222 175 15 a a DET ital-3222 175 16 common common ADJ ital-3222 175 17 word word NOUN ital-3222 175 18 dictionary dictionary NOUN ital-3222 175 19 for for ADP ital-3222 175 20 improved improved ADJ ital-3222 175 21 compression compression NOUN ital-3222 175 22 ) ) PUNCT ital-3222 175 23 and and CCONJ ital-3222 175 24 uses use VERB ital-3222 175 25 two two NUM ital-3222 175 26 back back ADJ ital-3222 175 27 - - PUNCT ital-3222 175 28 end end NOUN ital-3222 175 29 compression compression NOUN ital-3222 175 30 algorithms algorithm NOUN ital-3222 175 31 : : PUNCT ital-3222 175 32 deflate deflate PROPN ital-3222 175 33 and and CCONJ ital-3222 175 34 lzma lzma PROPN ital-3222 175 35 ( ( PUNCT ital-3222 175 36 differing differ VERB ital-3222 175 37 in in ADP ital-3222 175 38 compression compression NOUN ital-3222 175 39 speed speed NOUN ital-3222 175 40 and and CCONJ ital-3222 175 41 efficiency efficiency NOUN ital-3222 175 42 ) ) PUNCT ital-3222 175 43 . . 
PUNCT ital-3222 176 1 the the DET ital-3222 176 2 improvement improvement NOUN ital-3222 176 3 in in ADP ital-3222 176 4 compression compression NOUN ital-3222 176 5 efficiency efficiency NOUN ital-3222 176 6 , , PUNCT ital-3222 176 7 which which PRON ital-3222 176 8 can can AUX ital-3222 176 9 be be AUX ital-3222 176 10 observed observe VERB ital-3222 176 11 in in ADP ital-3222 176 12 the the DET ital-3222 176 13 experimental experimental ADJ ital-3222 176 14 results result NOUN ital-3222 176 15 , , PUNCT ital-3222 176 16 amounts amount VERB ital-3222 176 17 to to ADP ital-3222 176 18 a a DET ital-3222 176 19 significant significant ADJ ital-3222 176 20 reduction reduction NOUN ital-3222 176 21 of of ADP ital-3222 176 22 data data NOUN ital-3222 176 23 storage storage NOUN ital-3222 176 24 requirements requirement NOUN ital-3222 176 25 , , PUNCT ital-3222 176 26 giving give VERB ital-3222 176 27 the the DET ital-3222 176 28 reasons reason NOUN ital-3222 176 29 to to PART ital-3222 176 30 use use VERB ital-3222 176 31 the the DET ital-3222 176 32 library library NOUN ital-3222 176 33 in in ADP ital-3222 176 34 both both CCONJ ital-3222 176 35 new new ADJ ital-3222 176 36 and and CCONJ ital-3222 176 37 existing exist VERB ital-3222 176 38 digital digital ADJ ital-3222 176 39 library library NOUN ital-3222 176 40 projects project NOUN ital-3222 176 41 instead instead ADV ital-3222 176 42 of of ADP ital-3222 176 43 general general ADJ ital-3222 176 44 - - PUNCT ital-3222 176 45 purpose purpose NOUN ital-3222 176 46 compression compression NOUN ital-3222 176 47 programs program NOUN ital-3222 176 48 . . 
PUNCT ital-3222 177 1 to to PART ital-3222 177 2 facilitate facilitate VERB ital-3222 177 3 this this DET ital-3222 177 4 process process NOUN ital-3222 177 5 , , PUNCT ital-3222 177 6 we we PRON ital-3222 177 7 implemented implement VERB ital-3222 177 8 the the DET ital-3222 177 9 scheme scheme NOUN ital-3222 177 10 as as ADP ital-3222 177 11 an an DET ital-3222 177 12 open open ADJ ital-3222 177 13 - - PUNCT ital-3222 177 14 source source NOUN ital-3222 177 15 software software NOUN ital-3222 177 16 library library NOUN ital-3222 177 17 under under ADP ital-3222 177 18 the the DET ital-3222 177 19 same same ADJ ital-3222 177 20 name name NOUN ital-3222 177 21 , , PUNCT ital-3222 177 22 freely freely ADV ital-3222 177 23 available available ADJ ital-3222 177 24 at at ADP ital-3222 177 25 http://www.ii.uni.wroc http://www.ii.uni.wroc PROPN ital-3222 177 26 . . PUNCT ital-3222 178 1 p p PROPN ital-3222 178 2 l l PROPN ital-3222 178 3 / / PUNCT ital-3222 179 1 ~ ~ PUNCT ital-3222 179 2 i i PRON ital-3222 179 3 n n VERB ital-3222 179 4 i i PRON ital-3222 179 5 k k X ital-3222 179 6 e e PROPN ital-3222 180 1 p p PROPN ital-3222 180 2 / / SYM ital-3222 180 3 re re PROPN ital-3222 180 4 s s VERB ital-3222 180 5 e e X ital-3222 180 6 a a DET ital-3222 180 7 rc rc PROPN ital-3222 180 8 h h PROPN ital-3222 180 9 / / SYM ital-3222 180 10 c c NOUN ital-3222 180 11 t t PROPN ital-3222 180 12 d d PROPN ital-3222 180 13 l l NOUN ital-3222 180 14 / / SYM ital-3222 180 15 ctdl09.zip ctdl09.zip VERB ital-3222 180 16 . . 
PUNCT ital-3222 181 1 although although SCONJ ital-3222 181 2 the the DET ital-3222 181 3 scheme scheme NOUN ital-3222 181 4 and and CCONJ ital-3222 181 5 the the DET ital-3222 181 6 library library NOUN ital-3222 181 7 are be AUX ital-3222 181 8 now now ADV ital-3222 181 9 complete complete ADJ ital-3222 181 10 , , PUNCT ital-3222 181 11 we we PRON ital-3222 181 12 plan plan VERB ital-3222 181 13 future future ADJ ital-3222 181 14 extensions extension NOUN ital-3222 181 15 aiming aim VERB ital-3222 181 16 both both PRON ital-3222 181 17 to to PART ital-3222 181 18 increase increase VERB ital-3222 181 19 the the DET ital-3222 181 20 level level NOUN ital-3222 181 21 of of ADP ital-3222 181 22 specializations specialization NOUN ital-3222 181 23 for for ADP ital-3222 181 24 currently currently ADV ital-3222 181 25 handled handle VERB ital-3222 181 26 document document NOUN ital-3222 181 27 formats format NOUN ital-3222 181 28 and and CCONJ ital-3222 181 29 to to PART ital-3222 181 30 extend extend VERB ital-3222 181 31 the the DET ital-3222 181 32 list list NOUN ital-3222 181 33 of of ADP ital-3222 181 34 handled handle VERB ital-3222 181 35 document document NOUN ital-3222 181 36 formats format NOUN ital-3222 181 37 . . PUNCT ital-3222 182 1 table table NOUN ital-3222 182 2 10 10 NUM ital-3222 182 3 . . 
PUNCT ital-3222 182 4 compression compression NOUN ital-3222 182 5 efficiency efficiency NOUN ital-3222 182 6 and and CCONJ ital-3222 182 7 times time NOUN ital-3222 182 8 for for ADP ital-3222 182 9 the the DET ital-3222 182 10 ( ( PUNCT ital-3222 182 11 uncompressed uncompressed ADJ ital-3222 182 12 ) ) PUNCT ital-3222 182 13 pdf pdf NOUN ital-3222 182 14 documents document NOUN ital-3222 182 15 deflate deflate VERB ital-3222 182 16 lzma lzma PROPN ital-3222 182 17 bzip2 bzip2 NOUN ital-3222 182 18 ppmvc ppmvc NOUN ital-3222 182 19 file file NOUN ital-3222 182 20 name name NOUN ital-3222 182 21 native native ADJ ital-3222 182 22 gzip gzip PROPN ital-3222 182 23 ctdl ctdl NOUN ital-3222 183 1 ctdl+ ctdl+ X ital-3222 183 2 7 7 NUM ital-3222 183 3 - - PUNCT ital-3222 183 4 zip zip NOUN ital-3222 183 5 ctdl ctdl NOUN ital-3222 183 6 ctdl+ ctdl+ X ital-3222 183 7 13601 13601 NUM ital-3222 183 8 - - PUNCT ital-3222 183 9 t t NOUN ital-3222 183 10 3.443 3.443 NUM ital-3222 183 11 2.624 2.624 NUM ital-3222 183 12 2.191 2.191 NUM ital-3222 183 13 2.200 2.200 NUM ital-3222 183 14 1.986 1.986 NUM ital-3222 183 15 1.708 1.708 NUM ital-3222 183 16 1.656 1.656 NUM ital-3222 183 17 1.852 1.852 NUM ital-3222 183 18 1.659 1.659 NUM ital-3222 183 19 16514 16514 NUM ital-3222 183 20 - - PUNCT ital-3222 183 21 t t NOUN ital-3222 183 22 4.370 4.370 NUM ital-3222 183 23 2.839 2.839 NUM ital-3222 183 24 2.836 2.836 NUM ital-3222 183 25 2.810 2.810 NUM ital-3222 183 26 2.422 2.422 NUM ital-3222 183 27 2.422 2.422 NUM ital-3222 183 28 2.328 2.328 NUM ital-3222 183 29 2.378 2.378 NUM ital-3222 183 30 2.241 2.241 NUM ital-3222 183 31 1noam10 1noam10 NUM ital-3222 183 32 t t NOUN ital-3222 183 33 3.379 3.379 NUM ital-3222 183 34 2.522 2.522 NUM ital-3222 183 35 2.103 2.103 NUM ital-3222 183 36 2.094 2.094 NUM ital-3222 183 37 1.924 1.924 NUM ital-3222 183 38 1.659 1.659 NUM ital-3222 183 39 1.603 1.603 NUM ital-3222 183 40 1.770 1.770 NUM ital-3222 183 41 1.587 1.587 NUM ital-3222 183 42 
2ws2610 2ws2610 NUM ital-3222 183 43 3.519 3.519 NUM ital-3222 183 44 2.204 2.204 NUM ital-3222 183 45 2.346 2.346 NUM ital-3222 183 46 2.248 2.248 NUM ital-3222 183 47 1.781 1.781 NUM ital-3222 183 48 1.947 1.947 NUM ital-3222 183 49 1.860 1.860 NUM ital-3222 183 50 1.625 1.625 NUM ital-3222 183 51 1.480 1.480 NUM ital-3222 183 52 alice30 alice30 NOUN ital-3222 183 53 3.886 3.886 NUM ital-3222 183 54 2.863 2.863 NUM ital-3222 183 55 2.753 2.753 NUM ital-3222 183 56 2.668 2.668 NUM ital-3222 183 57 2.429 2.429 NUM ital-3222 183 58 2.308 2.308 NUM ital-3222 183 59 2.216 2.216 NUM ital-3222 183 60 2.315 2.315 NUM ital-3222 183 61 2.137 2.137 NUM ital-3222 183 62 cdscs10 cdscs10 ADJ ital-3222 183 63 t t NOUN ital-3222 183 64 3.684 3.684 NUM ital-3222 183 65 2.835 2.835 NUM ital-3222 183 66 2.688 2.688 NUM ital-3222 183 67 2.557 2.557 NUM ital-3222 183 68 2.399 2.399 NUM ital-3222 183 69 2.276 2.276 NUM ital-3222 183 70 2.164 2.164 NUM ital-3222 183 71 2.260 2.260 NUM ital-3222 183 72 2.079 2.079 NUM ital-3222 183 73 grimm10 grimm10 NOUN ital-3222 183 74 t t NOUN ital-3222 183 75 3.543 3.543 NUM ital-3222 183 76 2.557 2.557 NUM ital-3222 183 77 2.135 2.135 NUM ital-3222 183 78 2.120 2.120 NUM ital-3222 183 79 2.008 2.008 NUM ital-3222 183 80 1.713 1.713 NUM ital-3222 183 81 1.661 1.661 NUM ital-3222 183 82 1.858 1.858 NUM ital-3222 183 83 1.696 1.696 NUM ital-3222 183 84 pandp12 pandp12 NOUN ital-3222 183 85 t t NOUN ital-3222 184 1 3.552 3.552 NUM ital-3222 184 2 2.684 2.684 NUM ital-3222 184 3 2.267 2.267 NUM ital-3222 184 4 2.256 2.256 NUM ital-3222 184 5 2.071 2.071 NUM ital-3222 184 6 1.831 1.831 NUM ital-3222 184 7 1.769 1.769 NUM ital-3222 184 8 1.870 1.870 NUM ital-3222 184 9 1.705 1.705 NUM ital-3222 184 10 average average NOUN ital-3222 184 11 3.672 3.672 NUM ital-3222 184 12 2.641 2.641 NUM ital-3222 184 13 2.415 2.415 NUM ital-3222 184 14 2.369 2.369 NUM ital-3222 184 15 2.128 2.128 NUM ital-3222 184 16 1.983 1.983 NUM ital-3222 184 17 1.907 1.907 NUM 
ital-3222 184 18 1.991 1.991 NUM ital-3222 184 19 1.823 1.823 NUM ital-3222 184 20 comp comp NOUN ital-3222 184 21 . . PUNCT ital-3222 185 1 time time PROPN ital-3222 185 2 n n PROPN ital-3222 185 3 / / SYM ital-3222 185 4 a a DET ital-3222 185 5 1.594 1.594 NUM ital-3222 185 6 3.672 3.672 NUM ital-3222 185 7 3.250 3.250 NUM ital-3222 185 8 19.62 19.62 NUM ital-3222 185 9 13.31 13.31 NUM ital-3222 185 10 16.32 16.32 NUM ital-3222 185 11 5.641 5.641 NUM ital-3222 185 12 7.375 7.375 NUM ital-3222 185 13 dec dec PROPN ital-3222 185 14 . . PROPN ital-3222 185 15 time time PROPN ital-3222 185 16 n n PROPN ital-3222 185 17 / / SYM ital-3222 185 18 a a DET ital-3222 185 19 0.219 0.219 NUM ital-3222 185 20 0.844 0.844 NUM ital-3222 185 21 0.969 0.969 NUM ital-3222 185 22 0.719 0.719 NUM ital-3222 185 23 1.219 1.219 NUM ital-3222 185 24 1.360 1.360 NUM ital-3222 185 25 1.765 1.765 NUM ital-3222 185 26 7.859 7.859 NUM ital-3222 185 27 figure figure NOUN ital-3222 185 28 1 1 NUM ital-3222 185 29 . . 
PROPN ital-3222 185 30 compression compression NOUN ital-3222 185 31 improvement improvement NOUN ital-3222 185 32 relative relative ADJ ital-3222 185 33 to to AUX ital-3222 185 34 gzip gzip VERB ital-3222 185 35 the the DET ital-3222 185 36 efficient efficient ADJ ital-3222 185 37 storage storage NOUN ital-3222 185 38 of of ADP ital-3222 185 39 text text NOUN ital-3222 185 40 documents document NOUN ital-3222 185 41 in in ADP ital-3222 185 42 digital digital ADJ ital-3222 185 43 libraries library NOUN ital-3222 185 44 | | NOUN ital-3222 185 45 skibiński skibiński NOUN ital-3222 185 46 and and CCONJ ital-3222 185 47 swacha swacha PROPN ital-3222 185 48 153 153 NUM ital-3222 185 49 acknowledgements acknowledgement NOUN ital-3222 185 50 szymon szymon PROPN ital-3222 185 51 grabowski grabowski PROPN ital-3222 185 52 is be AUX ital-3222 185 53 the the DET ital-3222 185 54 coauthor coauthor NOUN ital-3222 185 55 of of ADP ital-3222 185 56 the the DET ital-3222 185 57 xml xml NOUN ital-3222 185 58 - - PUNCT ital-3222 185 59 wrt wrt NOUN ital-3222 185 60 transform transform NOUN ital-3222 185 61 , , PUNCT ital-3222 185 62 which which PRON ital-3222 185 63 served serve VERB ital-3222 185 64 as as ADP ital-3222 185 65 the the DET ital-3222 185 66 basis basis NOUN ital-3222 185 67 for for ADP ital-3222 185 68 the the DET ital-3222 185 69 ctdl ctdl PROPN ital-3222 185 70 library library PROPN ital-3222 185 71 . . PUNCT ital-3222 186 1 references reference NOUN ital-3222 186 2 1 1 NUM ital-3222 186 3 . . PUNCT ital-3222 186 4 john john PROPN ital-3222 186 5 f. f. PROPN ital-3222 186 6 gantz gantz PROPN ital-3222 186 7 et et PROPN ital-3222 186 8 al al PROPN ital-3222 186 9 . . 
PROPN ital-3222 186 10 , , PUNCT ital-3222 186 11 the the DET ital-3222 186 12 diverse diverse ADJ ital-3222 186 13 and and CCONJ ital-3222 186 14 exploding explode VERB ital-3222 186 15 digital digital ADJ ital-3222 186 16 universe universe NOUN ital-3222 186 17 : : PUNCT ital-3222 186 18 an an DET ital-3222 186 19 updated update VERB ital-3222 186 20 forecast forecast NOUN ital-3222 186 21 of of ADP ital-3222 186 22 worldwide worldwide ADJ ital-3222 186 23 information information NOUN ital-3222 186 24 growth growth NOUN ital-3222 186 25 through through ADP ital-3222 186 26 2011 2011 NUM ital-3222 186 27 ( ( PUNCT ital-3222 186 28 framingham framingham PROPN ital-3222 186 29 , , PUNCT ital-3222 186 30 mass mass PROPN ital-3222 186 31 . . PROPN ital-3222 186 32 : : PUNCT ital-3222 186 33 idc idc PROPN ital-3222 186 34 , , PUNCT ital-3222 186 35 2008 2008 NUM ital-3222 186 36 ) ) PUNCT ital-3222 186 37 , , PUNCT ital-3222 186 38 http://www http://www PROPN ital-3222 186 39 .emc.com .emc.com NOUN ital-3222 186 40 / / SYM ital-3222 186 41 collateral collateral NOUN ital-3222 186 42 / / SYM ital-3222 186 43 analyst analyst NOUN ital-3222 186 44 - - PUNCT ital-3222 186 45 reports report NOUN ital-3222 186 46 / / SYM ital-3222 186 47 diverse diverse NOUN ital-3222 186 48 - - PUNCT ital-3222 186 49 exploding explode VERB ital-3222 186 50 - - PUNCT ital-3222 186 51 digital digital NOUN ital-3222 186 52 -universe.pdf -universe.pdf PUNCT ital-3222 186 53 ( ( PUNCT ital-3222 186 54 accessed access VERB ital-3222 186 55 may may PROPN ital-3222 186 56 7 7 NUM ital-3222 186 57 , , PUNCT ital-3222 186 58 2009 2009 NUM ital-3222 186 59 ) ) PUNCT ital-3222 186 60 . . PUNCT ital-3222 187 1 2 2 X ital-3222 187 2 . . PROPN ital-3222 187 3 timothy timothy PROPN ital-3222 187 4 c. c. 
PROPN ital-3222 187 5 bell bell PROPN ital-3222 187 6 , , PUNCT ital-3222 187 7 alistair alistair NOUN ital-3222 187 8 moffat moffat NOUN ital-3222 187 9 , , PUNCT ital-3222 187 10 and and CCONJ ital-3222 187 11 ian ian PROPN ital-3222 187 12 h. h. PROPN ital-3222 187 13 witten witten PROPN ital-3222 187 14 , , PUNCT ital-3222 187 15 “ " PUNCT ital-3222 187 16 compressing compress VERB ital-3222 187 17 the the DET ital-3222 187 18 digital digital ADJ ital-3222 187 19 library library NOUN ital-3222 187 20 , , PUNCT ital-3222 187 21 ” " PUNCT ital-3222 187 22 in in ADP ital-3222 187 23 proceedings proceeding NOUN ital-3222 187 24 of of ADP ital-3222 187 25 digital digital ADJ ital-3222 187 26 libraries library NOUN ital-3222 187 27 ‘ ' PUNCT ital-3222 187 28 94 94 NUM ital-3222 187 29 ( ( PUNCT ital-3222 187 30 college college NOUN ital-3222 187 31 station station NOUN ital-3222 187 32 : : PUNCT ital-3222 187 33 texas texas PROPN ital-3222 187 34 a&m a&m PROPN ital-3222 187 35 univ univ PROPN ital-3222 187 36 . . PROPN ital-3222 187 37 1994 1994 NUM ital-3222 187 38 ): ): PUNCT ital-3222 187 39 41 41 NUM ital-3222 187 40 . . NUM ital-3222 187 41 3 3 NUM ital-3222 187 42 . . X ital-3222 187 43 ian ian PROPN ital-3222 187 44 h. h. PROPN ital-3222 187 45 witten witten PROPN ital-3222 187 46 and and CCONJ ital-3222 187 47 david david PROPN ital-3222 187 48 bainbridge bainbridge PROPN ital-3222 187 49 , , PUNCT ital-3222 187 50 how how SCONJ ital-3222 187 51 to to PART ital-3222 187 52 build build VERB ital-3222 187 53 a a DET ital-3222 187 54 digital digital ADJ ital-3222 187 55 library library NOUN ital-3222 187 56 ( ( PUNCT ital-3222 187 57 san san PROPN ital-3222 187 58 francisco francisco PROPN ital-3222 187 59 : : PUNCT ital-3222 187 60 morgan morgan PROPN ital-3222 187 61 kaufmann kaufmann PROPN ital-3222 187 62 , , PUNCT ital-3222 187 63 2002 2002 NUM ital-3222 187 64 ) ) PUNCT ital-3222 187 65 . . PUNCT ital-3222 188 1 4 4 X ital-3222 188 2 . . 
X ital-3222 188 3 chad chad PROPN ital-3222 188 4 m. m. NOUN ital-3222 188 5 kahl kahl PROPN ital-3222 188 6 and and CCONJ ital-3222 188 7 sarah sarah PROPN ital-3222 188 8 c. c. PROPN ital-3222 188 9 williams williams PROPN ital-3222 188 10 , , PUNCT ital-3222 188 11 “ " PUNCT ital-3222 188 12 accessing access VERB ital-3222 188 13 digital digital ADJ ital-3222 188 14 libraries library NOUN ital-3222 188 15 : : PUNCT ital-3222 188 16 a a DET ital-3222 188 17 study study NOUN ital-3222 188 18 of of ADP ital-3222 188 19 arl arl PROPN ital-3222 188 20 members members PROPN ital-3222 188 21 ’ ’ PART ital-3222 188 22 digital digital ADJ ital-3222 188 23 projects project NOUN ital-3222 188 24 , , PUNCT ital-3222 188 25 ” " PUNCT ital-3222 188 26 the the DET ital-3222 188 27 journal journal NOUN ital-3222 188 28 of of ADP ital-3222 188 29 academic academic ADJ ital-3222 188 30 librarianship librarianship NOUN ital-3222 188 31 32 32 NUM ital-3222 188 32 , , PUNCT ital-3222 188 33 no no NOUN ital-3222 188 34 . . NOUN ital-3222 188 35 4 4 NUM ital-3222 188 36 ( ( PUNCT ital-3222 188 37 2006 2006 NUM ital-3222 188 38 ): ): PUNCT ital-3222 188 39 364 364 NUM ital-3222 188 40 . . NUM ital-3222 188 41 5 5 NUM ital-3222 188 42 . . X ital-3222 188 43 donald donald PROPN ital-3222 188 44 e. e. PROPN ital-3222 188 45 knuth knuth PROPN ital-3222 188 46 , , PUNCT ital-3222 188 47 tex tex PROPN ital-3222 188 48 : : PUNCT ital-3222 188 49 the the DET ital-3222 188 50 program program NOUN ital-3222 188 51 ( ( PUNCT ital-3222 188 52 reading reading NOUN ital-3222 188 53 , , PUNCT ital-3222 188 54 mass mass PROPN ital-3222 188 55 . . 
PROPN ital-3222 188 56 : : PUNCT ital-3222 188 57 addison addison PROPN ital-3222 188 58 - - PUNCT ital-3222 188 59 wesley wesley PROPN ital-3222 188 60 , , PUNCT ital-3222 188 61 1986 1986 NUM ital-3222 188 62 ) ) PUNCT ital-3222 188 63 ; ; PUNCT ital-3222 188 64 microsoft microsoft PROPN ital-3222 188 65 technical technical ADJ ital-3222 188 66 support support NOUN ital-3222 188 67 , , PUNCT ital-3222 188 68 rich rich ADJ ital-3222 188 69 text text NOUN ital-3222 188 70 format format NOUN ital-3222 188 71 ( ( PUNCT ital-3222 188 72 rtf rtf ADJ ital-3222 188 73 ) ) PUNCT ital-3222 188 74 version version NOUN ital-3222 188 75 1.5 1.5 NUM ital-3222 188 76 specification specification NOUN ital-3222 188 77 , , PUNCT ital-3222 188 78 1997 1997 NUM ital-3222 188 79 , , PUNCT ital-3222 188 80 http://www.biblioscape http://www.biblioscape NOUN ital-3222 188 81 .com .com PUNCT ital-3222 188 82 / / SYM ital-3222 188 83 rtf15_spec.htm rtf15_spec.htm PROPN ital-3222 188 84 ( ( PUNCT ital-3222 188 85 accessed access VERB ital-3222 188 86 may may PROPN ital-3222 188 87 7 7 NUM ital-3222 188 88 , , PUNCT ital-3222 188 89 2009 2009 NUM ital-3222 188 90 ) ) PUNCT ital-3222 188 91 ; ; PUNCT ital-3222 188 92 tim tim PROPN ital-3222 188 93 bray bray NOUN ital-3222 188 94 et et PROPN ital-3222 189 1 al al PROPN ital-3222 189 2 . . PROPN ital-3222 189 3 , , PUNCT ital-3222 189 4 eds ed NOUN ital-3222 189 5 . . 
PROPN ital-3222 189 6 , , PUNCT ital-3222 189 7 extensible extensible ADJ ital-3222 189 8 markup markup PROPN ital-3222 189 9 language language NOUN ital-3222 189 10 ( ( PUNCT ital-3222 189 11 xml xml NOUN ital-3222 189 12 ) ) PUNCT ital-3222 189 13 1.0 1.0 NUM ital-3222 189 14 ( ( PUNCT ital-3222 189 15 fourth fourth ADJ ital-3222 189 16 edition edition NOUN ital-3222 189 17 ) ) PUNCT ital-3222 189 18 , , PUNCT ital-3222 189 19 2006 2006 NUM ital-3222 189 20 , , PUNCT ital-3222 189 21 http://www.w3.org/tr/2006/rec-xml-20060816 http://www.w3.org/tr/2006/rec-xml-20060816 PROPN ital-3222 189 22 ( ( PUNCT ital-3222 189 23 accessed access VERB ital-3222 189 24 may may PROPN ital-3222 189 25 7 7 NUM ital-3222 189 26 , , PUNCT ital-3222 189 27 2009 2009 NUM ital-3222 189 28 ) ) PUNCT ital-3222 189 29 ; ; PUNCT ital-3222 189 30 dave dave PROPN ital-3222 189 31 raggett raggett PROPN ital-3222 189 32 , , PUNCT ital-3222 189 33 arnaud arnaud PROPN ital-3222 189 34 le le X ital-3222 189 35 hors hor NOUN ital-3222 189 36 , , PUNCT ital-3222 189 37 and and CCONJ ital-3222 189 38 ian ian ADJ ital-3222 189 39 jacobs jacobs PROPN ital-3222 189 40 , , PUNCT ital-3222 189 41 eds ed NOUN ital-3222 189 42 . . 
PUNCT ital-3222 189 43 , , PUNCT ital-3222 189 44 w3c w3c PROPN ital-3222 189 45 html html PROPN ital-3222 189 46 4.01 4.01 NUM ital-3222 189 47 specification specification NOUN ital-3222 189 48 , , PUNCT ital-3222 189 49 1999 1999 NUM ital-3222 189 50 , , PUNCT ital-3222 189 51 http://www.w3.org/ http://www.w3.org/ VERB ital-3222 189 52 tr tr VERB ital-3222 189 53 / / SYM ital-3222 189 54 rec rec NOUN ital-3222 189 55 - - PUNCT ital-3222 189 56 html40/ html40/ X ital-3222 189 57 ( ( PUNCT ital-3222 189 58 accessed access VERB ital-3222 189 59 may may PROPN ital-3222 189 60 7 7 NUM ital-3222 189 61 , , PUNCT ital-3222 189 62 2009 2009 NUM ital-3222 189 63 ) ) PUNCT ital-3222 189 64 ; ; PUNCT ital-3222 189 65 postscript postscript ADJ ital-3222 189 66 language language NOUN ital-3222 189 67 reference reference NOUN ital-3222 189 68 , , PUNCT ital-3222 189 69 3rd 3rd NOUN ital-3222 189 70 ed ed NOUN ital-3222 189 71 . . PUNCT ital-3222 190 1 ( ( PUNCT ital-3222 190 2 reading reading NOUN ital-3222 190 3 , , PUNCT ital-3222 190 4 mass mass PROPN ital-3222 190 5 . . PROPN ital-3222 190 6 : : PUNCT ital-3222 190 7 addison addison PROPN ital-3222 190 8 - - PUNCT ital-3222 190 9 wesley wesley PROPN ital-3222 190 10 , , PUNCT ital-3222 190 11 1999 1999 NUM ital-3222 190 12 ) ) PUNCT ital-3222 190 13 , , PUNCT ital-3222 190 14 http://www.adobe.com/devnet/postscript/pdfs/plrm.pdf http://www.adobe.com/devnet/postscript/pdfs/plrm.pdf NOUN ital-3222 190 15 ( ( PUNCT ital-3222 190 16 accessed access VERB ital-3222 190 17 may may PROPN ital-3222 190 18 7 7 NUM ital-3222 190 19 , , PUNCT ital-3222 190 20 2009 2009 NUM ital-3222 190 21 ) ) PUNCT ital-3222 190 22 ; ; PUNCT ital-3222 190 23 pdf pdf NOUN ital-3222 190 24 reference reference NOUN ital-3222 190 25 , , PUNCT ital-3222 190 26 6th 6th ADJ ital-3222 190 27 ed ed NOUN ital-3222 190 28 . . 
PROPN ital-3222 190 29 , , PUNCT ital-3222 190 30 version version NOUN ital-3222 190 31 1.7 1.7 NUM ital-3222 190 32 , , PUNCT ital-3222 190 33 2006 2006 NUM ital-3222 190 34 , , PUNCT ital-3222 190 35 http://www.adobe.com/devnet/acrobat/pdfs/pdf http://www.adobe.com/devnet/acrobat/pdfs/pdf NOUN ital-3222 190 36 _ _ NOUN ital-3222 190 37 reference_1-7.pdf reference_1-7.pdf NOUN ital-3222 190 38 ( ( PUNCT ital-3222 190 39 accessed access VERB ital-3222 190 40 may may PROPN ital-3222 190 41 7 7 NUM ital-3222 190 42 , , PUNCT ital-3222 190 43 2009 2009 NUM ital-3222 190 44 ) ) PUNCT ital-3222 190 45 . . PUNCT ital-3222 191 1 6 6 X ital-3222 191 2 . . X ital-3222 191 3 jacob jacob PROPN ital-3222 191 4 ziv ziv PROPN ital-3222 191 5 and and CCONJ ital-3222 191 6 abraham abraham PROPN ital-3222 191 7 lempel lempel PROPN ital-3222 191 8 , , PUNCT ital-3222 191 9 “ " PUNCT ital-3222 191 10 a a DET ital-3222 191 11 universal universal ADJ ital-3222 191 12 algorithm algorithm NOUN ital-3222 191 13 for for ADP ital-3222 191 14 sequential sequential ADJ ital-3222 191 15 data datum NOUN ital-3222 191 16 compression compression NOUN ital-3222 191 17 , , PUNCT ital-3222 191 18 ” " PUNCT ital-3222 191 19 ieee ieee NOUN ital-3222 191 20 transactions transaction NOUN ital-3222 191 21 on on ADP ital-3222 191 22 information information NOUN ital-3222 191 23 theory theory NOUN ital-3222 191 24 23 23 NUM ital-3222 191 25 , , PUNCT ital-3222 191 26 no no INTJ ital-3222 191 27 . . NOUN ital-3222 191 28 3 3 NUM ital-3222 191 29 ( ( PUNCT ital-3222 191 30 1977 1977 NUM ital-3222 191 31 ): ): PUNCT ital-3222 191 32 337 337 NUM ital-3222 191 33 . . NUM ital-3222 191 34 7 7 NUM ital-3222 191 35 . . PROPN ital-3222 191 36 ian ian PROPN ital-3222 191 37 h. h. 
PROPN ital-3222 191 38 witten witten PROPN ital-3222 191 39 , , PUNCT ital-3222 191 40 alistair alistair NOUN ital-3222 191 41 moffat moffat NOUN ital-3222 191 42 , , PUNCT ital-3222 191 43 and and CCONJ ital-3222 191 44 timothy timothy PROPN ital-3222 191 45 c. c. PROPN ital-3222 191 46 bell bell PROPN ital-3222 191 47 , , PUNCT ital-3222 191 48 managing managing NOUN ital-3222 191 49 gigabytes gigabyte NOUN ital-3222 191 50 : : PUNCT ital-3222 191 51 compressing compressing ADJ ital-3222 191 52 and and CCONJ ital-3222 191 53 indexing indexing NOUN ital-3222 191 54 documents document NOUN ital-3222 191 55 and and CCONJ ital-3222 191 56 images image NOUN ital-3222 191 57 , , PUNCT ital-3222 191 58 2nd 2nd NOUN ital-3222 191 59 ed ed PROPN ital-3222 191 60 . . PUNCT ital-3222 192 1 ( ( PUNCT ital-3222 192 2 san san PROPN ital-3222 192 3 francisco francisco PROPN ital-3222 192 4 : : PUNCT ital-3222 192 5 morgan morgan PROPN ital-3222 192 6 kaufmann kaufmann PROPN ital-3222 192 7 , , PUNCT ital-3222 192 8 1999 1999 NUM ital-3222 192 9 ) ) PUNCT ital-3222 192 10 . . PUNCT ital-3222 193 1 8 8 X ital-3222 193 2 . . X ital-3222 193 3 john john PROPN ital-3222 193 4 g. g. PROPN ital-3222 193 5 cleary cleary PROPN ital-3222 193 6 and and CCONJ ital-3222 193 7 ian ian PROPN ital-3222 193 8 h. h. 
PROPN ital-3222 193 9 witten witten PROPN ital-3222 193 10 , , PUNCT ital-3222 193 11 “ " PUNCT ital-3222 193 12 data data NOUN ital-3222 193 13 compression compression NOUN ital-3222 193 14 using use VERB ital-3222 193 15 adaptive adaptive ADJ ital-3222 193 16 coding code VERB ital-3222 193 17 and and CCONJ ital-3222 193 18 partial partial ADJ ital-3222 193 19 string string NOUN ital-3222 193 20 matching matching NOUN ital-3222 193 21 , , PUNCT ital-3222 193 22 ” " PUNCT ital-3222 193 23 ieee ieee NOUN ital-3222 193 24 transactions transaction NOUN ital-3222 193 25 on on ADP ital-3222 193 26 communication communication NOUN ital-3222 193 27 32 32 NUM ital-3222 193 28 , , PUNCT ital-3222 193 29 no no NOUN ital-3222 193 30 . . NOUN ital-3222 193 31 4 4 NUM ital-3222 193 32 , , PUNCT ital-3222 193 33 ( ( PUNCT ital-3222 193 34 1984 1984 NUM ital-3222 193 35 ): ): PUNCT ital-3222 193 36 396 396 NUM ital-3222 193 37 ; ; PUNCT ital-3222 193 38 michael michael PROPN ital-3222 193 39 burrows burrow NOUN ital-3222 193 40 and and CCONJ ital-3222 193 41 david david PROPN ital-3222 193 42 j. j. 
PROPN ital-3222 193 43 wheeler wheeler PROPN ital-3222 193 44 , , PUNCT ital-3222 193 45 “ " PUNCT ital-3222 193 46 a a DET ital-3222 193 47 block block NOUN ital-3222 193 48 - - PUNCT ital-3222 193 49 sorting sort VERB ital-3222 193 50 lossless lossless NOUN ital-3222 193 51 data datum NOUN ital-3222 193 52 compression compression NOUN ital-3222 193 53 algorithm algorithm NOUN ital-3222 193 54 , , PUNCT ital-3222 193 55 ” " PUNCT ital-3222 193 56 digital digital ADJ ital-3222 193 57 equipment equipment NOUN ital-3222 193 58 corporation corporation PROPN ital-3222 193 59 src src PROPN ital-3222 193 60 research research PROPN ital-3222 193 61 report report NOUN ital-3222 193 62 124 124 NUM ital-3222 193 63 , , PUNCT ital-3222 193 64 1994 1994 NUM ital-3222 193 65 , , PUNCT ital-3222 193 66 www.hpl.hp.com/techreports/ www.hpl.hp.com/techreports/ PROPN ital-3222 193 67 compaq compaq PROPN ital-3222 193 68 - - PUNCT ital-3222 193 69 dec dec PROPN ital-3222 193 70 / / SYM ital-3222 193 71 src src NOUN ital-3222 193 72 - - PUNCT ital-3222 193 73 rr-124.pdf rr-124.pdf NOUN ital-3222 193 74 ( ( PUNCT ital-3222 193 75 accessed access VERB ital-3222 193 76 may may PROPN ital-3222 193 77 7 7 NUM ital-3222 193 78 , , PUNCT ital-3222 193 79 2009 2009 NUM ital-3222 193 80 ) ) PUNCT ital-3222 193 81 . . PUNCT ital-3222 194 1 9 9 X ital-3222 194 2 . . X ital-3222 195 1 witten witten PROPN ital-3222 195 2 , , PUNCT ital-3222 195 3 moffat moffat NOUN ital-3222 195 4 , , PUNCT ital-3222 195 5 and and CCONJ ital-3222 195 6 bell bell NOUN ital-3222 195 7 , , PUNCT ital-3222 195 8 managing managing NOUN ital-3222 195 9 gigabytes gigabyte NOUN ital-3222 195 10 . . PUNCT ital-3222 196 1 10 10 X ital-3222 196 2 . . PUNCT ital-3222 197 1 jon jon PROPN ital-3222 197 2 louis louis PROPN ital-3222 197 3 bentley bentley PROPN ital-3222 197 4 et et PROPN ital-3222 197 5 al al PROPN ital-3222 197 6 . . 
PROPN ital-3222 197 7 , , PUNCT ital-3222 197 8 “ " PUNCT ital-3222 197 9 a a DET ital-3222 197 10 locally locally ADV ital-3222 197 11 adaptive adaptive ADJ ital-3222 197 12 data datum NOUN ital-3222 197 13 compression compression NOUN ital-3222 197 14 scheme scheme NOUN ital-3222 197 15 , , PUNCT ital-3222 197 16 ” " PUNCT ital-3222 197 17 communications communication NOUN ital-3222 197 18 of of ADP ital-3222 197 19 the the DET ital-3222 197 20 acm acm NOUN ital-3222 197 21 29 29 NUM ital-3222 197 22 , , PUNCT ital-3222 197 23 no no NOUN ital-3222 197 24 . . NOUN ital-3222 197 25 4 4 NUM ital-3222 197 26 ( ( PUNCT ital-3222 197 27 1986 1986 NUM ital-3222 197 28 ): ): PUNCT ital-3222 197 29 320 320 NUM ital-3222 197 30 ; ; PUNCT ital-3222 197 31 r. r. PROPN ital-3222 197 32 nigel nigel PROPN ital-3222 197 33 horspool horspool PROPN ital-3222 197 34 and and CCONJ ital-3222 197 35 gordon gordon PROPN ital-3222 197 36 v. v. ADP ital-3222 197 37 cormack cormack PROPN ital-3222 197 38 , , PUNCT ital-3222 197 39 “ " PUNCT ital-3222 197 40 constructing construct VERB ital-3222 197 41 word word NOUN ital-3222 197 42 - - PUNCT ital-3222 197 43 based base VERB ital-3222 197 44 text text NOUN ital-3222 197 45 compression compression NOUN ital-3222 197 46 algorithms algorithm NOUN ital-3222 197 47 , , PUNCT ital-3222 197 48 ” " PUNCT ital-3222 197 49 proceedings proceeding NOUN ital-3222 197 50 of of ADP ital-3222 197 51 the the DET ital-3222 197 52 data data NOUN ital-3222 197 53 compression compression NOUN ital-3222 197 54 conference conference NOUN ital-3222 197 55 ( ( PUNCT ital-3222 197 56 snowbird snowbird PROPN ital-3222 197 57 , , PUNCT ital-3222 197 58 utah utah PROPN ital-3222 197 59 , , PUNCT ital-3222 197 60 1992 1992 NUM ital-3222 197 61 ): ): PUNCT ital-3222 197 62 62 62 NUM ital-3222 197 63 . . NUM ital-3222 197 64 11 11 NUM ital-3222 197 65 . . 
PUNCT ital-3222 198 1 see see VERB ital-3222 198 2 for for ADP ital-3222 198 3 example example NOUN ital-3222 198 4 andrei andrei NOUN ital-3222 198 5 v. v. ADP ital-3222 198 6 kadach kadach PROPN ital-3222 198 7 , , PUNCT ital-3222 198 8 “ " PUNCT ital-3222 198 9 text text NOUN ital-3222 198 10 and and CCONJ ital-3222 198 11 hypertext hypertext NOUN ital-3222 198 12 compression compression NOUN ital-3222 198 13 , , PUNCT ital-3222 198 14 ” " PUNCT ital-3222 198 15 programming programming NOUN ital-3222 198 16 & & CCONJ ital-3222 198 17 computer computer NOUN ital-3222 198 18 software software NOUN ital-3222 198 19 23 23 NUM ital-3222 198 20 , , PUNCT ital-3222 198 21 no no NOUN ital-3222 198 22 . . NOUN ital-3222 198 23 4 4 NUM ital-3222 198 24 ( ( PUNCT ital-3222 198 25 1997 1997 NUM ital-3222 198 26 ): ): PUNCT ital-3222 198 27 212 212 NUM ital-3222 198 28 ; ; PUNCT ital-3222 198 29 alistair alistair NOUN ital-3222 198 30 moffat moffat NOUN ital-3222 198 31 , , PUNCT ital-3222 198 32 “ " PUNCT ital-3222 198 33 word word NOUN ital-3222 198 34 - - PUNCT ital-3222 198 35 based base VERB ital-3222 198 36 text text NOUN ital-3222 198 37 compression compression NOUN ital-3222 198 38 , , PUNCT ital-3222 198 39 ” " PUNCT ital-3222 198 40 software software NOUN ital-3222 198 41 — — PUNCT ital-3222 198 42 practice practice NOUN ital-3222 198 43 & & CCONJ ital-3222 198 44 experience experience NOUN ital-3222 198 45 2 2 NUM ital-3222 198 46 , , PUNCT ital-3222 198 47 no no NOUN ital-3222 198 48 . . 
NOUN ital-3222 198 49 19 19 NUM ital-3222 198 50 ( ( PUNCT ital-3222 198 51 1989 1989 NUM ital-3222 198 52 ): ): PUNCT ital-3222 198 53 185 185 NUM ital-3222 198 54 ; ; PUNCT ital-3222 198 55 przemysław przemysław ADJ ital-3222 198 56 skibiński skibiński ADJ ital-3222 198 57 , , PUNCT ital-3222 198 58 szymon szymon ADJ ital-3222 198 59 grabowski grabowski PROPN ital-3222 198 60 , , PUNCT ital-3222 198 61 and and CCONJ ital-3222 198 62 sebastian sebastian ADJ ital-3222 198 63 deorowicz deorowicz NOUN ital-3222 198 64 , , PUNCT ital-3222 198 65 “ " PUNCT ital-3222 198 66 revisiting revisit VERB ital-3222 198 67 dictionary dictionary ADJ ital-3222 198 68 - - PUNCT ital-3222 198 69 based base VERB ital-3222 198 70 compression compression NOUN ital-3222 198 71 , , PUNCT ital-3222 198 72 ” " PUNCT ital-3222 198 73 software software NOUN ital-3222 198 74 — — PUNCT ital-3222 198 75 practice practice NOUN ital-3222 198 76 & & CCONJ ital-3222 198 77 experience experience NOUN ital-3222 198 78 35 35 NUM ital-3222 198 79 , , PUNCT ital-3222 198 80 no no NOUN ital-3222 198 81 . . NOUN ital-3222 198 82 15 15 NUM ital-3222 198 83 ( ( PUNCT ital-3222 198 84 2005 2005 NUM ital-3222 198 85 ): ): PUNCT ital-3222 198 86 1455 1455 NUM ital-3222 198 87 . . PUNCT ital-3222 199 1 12 12 NUM ital-3222 199 2 . . 
PUNCT ital-3222 200 1 przemysław przemysław PROPN ital-3222 200 2 skibiński skibiński PROPN ital-3222 200 3 , , PUNCT ital-3222 200 4 jakub jakub PROPN ital-3222 200 5 swacha swacha PROPN ital-3222 200 6 , , PUNCT ital-3222 200 7 and and CCONJ ital-3222 200 8 szymon szymon ADJ ital-3222 200 9 grabowski grabowski PROPN ital-3222 200 10 , , PUNCT ital-3222 200 11 “ " PUNCT ital-3222 200 12 a a DET ital-3222 200 13 highly highly ADV ital-3222 200 14 efficient efficient ADJ ital-3222 200 15 xml xml NOUN ital-3222 200 16 compression compression NOUN ital-3222 200 17 scheme scheme NOUN ital-3222 200 18 for for ADP ital-3222 200 19 the the DET ital-3222 200 20 web web NOUN ital-3222 200 21 , , PUNCT ital-3222 200 22 ” " PUNCT ital-3222 200 23 proceedings proceeding NOUN ital-3222 200 24 of of ADP ital-3222 200 25 the the DET ital-3222 200 26 34th 34th ADJ ital-3222 200 27 international international ADJ ital-3222 200 28 conference conference NOUN ital-3222 200 29 on on ADP ital-3222 200 30 current current ADJ ital-3222 200 31 trends trend NOUN ital-3222 200 32 in in ADP ital-3222 200 33 theory theory NOUN ital-3222 200 34 and and CCONJ ital-3222 200 35 practice practice NOUN ital-3222 200 36 of of ADP ital-3222 200 37 computer computer NOUN ital-3222 200 38 science science NOUN ital-3222 200 39 , , PUNCT ital-3222 200 40 lncs lnc VERB ital-3222 200 41 4910 4910 NUM ital-3222 200 42 ( ( PUNCT ital-3222 200 43 2008 2008 NUM ital-3222 200 44 ): ): PUNCT ital-3222 200 45 766 766 NUM ital-3222 200 46 . . PROPN ital-3222 200 47 13 13 NUM ital-3222 200 48 . . PUNCT ital-3222 201 1 jon jon PROPN ital-3222 201 2 louis louis PROPN ital-3222 201 3 bentley bentley PROPN ital-3222 201 4 et et PROPN ital-3222 201 5 al al PROPN ital-3222 201 6 . . 
PROPN ital-3222 201 7 , , PUNCT ital-3222 201 8 “ " PUNCT ital-3222 201 9 a a DET ital-3222 201 10 locally locally ADV ital-3222 201 11 adaptive adaptive ADJ ital-3222 201 12 data datum NOUN ital-3222 201 13 compression compression NOUN ital-3222 201 14 scheme scheme NOUN ital-3222 201 15 , , PUNCT ital-3222 201 16 ” " PUNCT ital-3222 201 17 communications communication NOUN ital-3222 201 18 of of ADP ital-3222 201 19 the the DET ital-3222 201 20 acm acm NOUN ital-3222 201 21 29 29 NUM ital-3222 201 22 , , PUNCT ital-3222 201 23 no no NOUN ital-3222 201 24 . . NOUN ital-3222 201 25 4 4 NUM ital-3222 201 26 ( ( PUNCT ital-3222 201 27 1986 1986 NUM ital-3222 201 28 ): ): SYM ital-3222 201 29 320 320 NUM ital-3222 201 30 . . NUM ital-3222 201 31 14 14 NUM ital-3222 201 32 . . PUNCT ital-3222 202 1 skibiński skibiński ADV ital-3222 202 2 , , PUNCT ital-3222 202 3 grabowski grabowski PROPN ital-3222 202 4 , , PUNCT ital-3222 202 5 and and CCONJ ital-3222 202 6 deorowicz deorowicz NOUN ital-3222 202 7 , , PUNCT ital-3222 202 8 “ " PUNCT ital-3222 202 9 revisiting revisit VERB ital-3222 202 10 dictionary dictionary ADJ ital-3222 202 11 - - PUNCT ital-3222 202 12 based base VERB ital-3222 202 13 compression compression NOUN ital-3222 202 14 , , PUNCT ital-3222 202 15 ” " PUNCT ital-3222 202 16 1455 1455 NUM ital-3222 202 17 . . PUNCT ital-3222 203 1 15 15 NUM ital-3222 203 2 . . 
PUNCT ital-3222 204 1 skibiński skibiński PROPN ital-3222 204 2 , , PUNCT ital-3222 204 3 swacha swacha PROPN ital-3222 204 4 , , PUNCT ital-3222 204 5 and and CCONJ ital-3222 204 6 grabowski grabowski PROPN ital-3222 204 7 , , PUNCT ital-3222 204 8 “ " PUNCT ital-3222 204 9 a a DET ital-3222 204 10 highly highly ADV ital-3222 204 11 efficient efficient ADJ ital-3222 204 12 xml xml NOUN ital-3222 204 13 compression compression NOUN ital-3222 204 14 scheme scheme NOUN ital-3222 204 15 for for ADP ital-3222 204 16 the the DET ital-3222 204 17 web web NOUN ital-3222 204 18 , , PUNCT ital-3222 204 19 ” " PUNCT ital-3222 204 20 766 766 NUM ital-3222 204 21 . . PUNCT ital-3222 205 1 16 16 NUM ital-3222 205 2 . . PUNCT ital-3222 206 1 peter peter PROPN ital-3222 206 2 deutsch deutsch PROPN ital-3222 206 3 , , PUNCT ital-3222 206 4 “ " PUNCT ital-3222 206 5 deflate deflate ADJ ital-3222 206 6 compressed compress VERB ital-3222 206 7 data data NOUN ital-3222 206 8 format format NOUN ital-3222 206 9 specification specification NOUN ital-3222 206 10 version version NOUN ital-3222 206 11 1.3 1.3 NUM ital-3222 206 12 , , PUNCT ital-3222 206 13 ” " PUNCT ital-3222 206 14 rfc1951 rfc1951 NOUN ital-3222 206 15 , , PUNCT ital-3222 206 16 network network NOUN ital-3222 206 17 working working NOUN ital-3222 206 18 group group NOUN ital-3222 206 19 , , PUNCT ital-3222 206 20 1996 1996 NUM ital-3222 206 21 , , PUNCT ital-3222 206 22 www.ietf.org/rfc/rfc1951.txt www.ietf.org/rfc/rfc1951.txt PROPN ital-3222 206 23 ( ( PUNCT ital-3222 206 24 accessed access VERB ital-3222 206 25 may may PROPN ital-3222 206 26 7 7 NUM ital-3222 206 27 , , PUNCT ital-3222 206 28 2009 2009 NUM ital-3222 206 29 ) ) PUNCT ital-3222 206 30 . . PUNCT ital-3222 207 1 17 17 NUM ital-3222 207 2 . . 
PUNCT ital-3222 208 1 christian christian PROPN ital-3222 208 2 schneider schneider PROPN ital-3222 208 3 , , PUNCT ital-3222 208 4 precomp precomp NOUN ital-3222 208 5 — — PUNCT ital-3222 208 6 a a DET ital-3222 208 7 command command NOUN ital-3222 208 8 line line NOUN ital-3222 208 9 precompressor precompressor NOUN ital-3222 208 10 , , PUNCT ital-3222 208 11 2009 2009 NUM ital-3222 208 12 , , PUNCT ital-3222 208 13 http://schnaader.info/precomp.html http://schnaader.info/precomp.html PUNCT ital-3222 208 14 ( ( PUNCT ital-3222 208 15 accessed access VERB ital-3222 208 16 may may PROPN ital-3222 208 17 7 7 NUM ital-3222 208 18 , , PUNCT ital-3222 208 19 2009 2009 NUM ital-3222 208 20 ) ) PUNCT ital-3222 208 21 . . PUNCT ital-3222 209 1 18 18 NUM ital-3222 209 2 . . PUNCT ital-3222 210 1 the the DET ital-3222 210 2 technical technical ADJ ital-3222 210 3 details detail NOUN ital-3222 210 4 of of ADP ital-3222 210 5 the the DET ital-3222 210 6 algorithm algorithm NOUN ital-3222 210 7 constructing construct VERB ital-3222 210 8 code code NOUN ital-3222 210 9 words word NOUN ital-3222 210 10 and and CCONJ ital-3222 210 11 assigning assign VERB ital-3222 210 12 them they PRON ital-3222 210 13 to to ADP ital-3222 210 14 indexes index NOUN ital-3222 210 15 , , PUNCT ital-3222 210 16 and and CCONJ ital-3222 210 17 encoding encode VERB ital-3222 210 18 numbers number NOUN ital-3222 210 19 and and CCONJ ital-3222 210 20 special special ADJ ital-3222 210 21 tokens token NOUN ital-3222 210 22 , , PUNCT ital-3222 210 23 are be AUX ital-3222 210 24 given give VERB ital-3222 210 25 in in ADP ital-3222 210 26 skibiński skibiński ADJ ital-3222 210 27 , , PUNCT ital-3222 210 28 swacha swacha PROPN ital-3222 210 29 , , PUNCT ital-3222 210 30 and and CCONJ ital-3222 210 31 grabowski grabowski PROPN ital-3222 210 32 , , PUNCT ital-3222 210 33 “ " PUNCT ital-3222 210 34 a a DET ital-3222 210 35 highly highly ADV ital-3222 210 36 efficient efficient ADJ ital-3222 210 37 xml xml NOUN 
ital-3222 210 38 compression compression NOUN ital-3222 210 39 scheme scheme NOUN ital-3222 210 40 for for ADP ital-3222 210 41 the the DET ital-3222 210 42 web web NOUN ital-3222 210 43 , , PUNCT ital-3222 210 44 ” " PUNCT ital-3222 210 45 766 766 NUM ital-3222 210 46 . . PROPN ital-3222 210 47 19 19 NUM ital-3222 210 48 . . PUNCT ital-3222 211 1 david david PROPN ital-3222 211 2 solomon solomon PROPN ital-3222 211 3 , , PUNCT ital-3222 211 4 data data NOUN ital-3222 211 5 compression compression NOUN ital-3222 211 6 : : PUNCT ital-3222 211 7 the the DET ital-3222 211 8 complete complete ADJ ital-3222 211 9 reference reference NOUN ital-3222 211 10 , , PUNCT ital-3222 211 11 4th 4th ADJ ital-3222 211 12 ed ed NOUN ital-3222 211 13 . . PUNCT ital-3222 212 1 ( ( PUNCT ital-3222 212 2 london london PROPN ital-3222 212 3 : : PUNCT ital-3222 212 4 springer springer NOUN ital-3222 212 5 - - PUNCT ital-3222 212 6 verlag verlag PROPN ital-3222 212 7 , , PUNCT ital-3222 212 8 2006 2006 NUM ital-3222 212 9 ) ) PUNCT ital-3222 212 10 . . PUNCT ital-3222 213 1 20 20 NUM ital-3222 213 2 . . PUNCT ital-3222 214 1 skibiński skibiński PROPN ital-3222 214 2 , , PUNCT ital-3222 214 3 swacha swacha PROPN ital-3222 214 4 , , PUNCT ital-3222 214 5 and and CCONJ ital-3222 214 6 grabowski grabowski PROPN ital-3222 214 7 , , PUNCT ital-3222 214 8 “ " PUNCT ital-3222 214 9 a a DET ital-3222 214 10 highly highly ADV ital-3222 214 11 efficient efficient ADJ ital-3222 214 12 xml xml NOUN ital-3222 214 13 compression compression NOUN ital-3222 214 14 scheme scheme NOUN ital-3222 214 15 for for ADP ital-3222 214 16 the the DET ital-3222 214 17 web web NOUN ital-3222 214 18 , , PUNCT ital-3222 214 19 ” " PUNCT ital-3222 214 20 766 766 NUM ital-3222 214 21 . . PROPN ital-3222 214 22 21 21 NUM ital-3222 214 23 . . 
PUNCT ital-3222 215 1 dave dave PROPN ital-3222 215 2 raggett raggett PROPN ital-3222 215 3 , , PUNCT ital-3222 215 4 arnaud arnaud PROPN ital-3222 215 5 le le X ital-3222 215 6 hors hor NOUN ital-3222 215 7 , , PUNCT ital-3222 215 8 and and CCONJ ital-3222 215 9 ian ian ADJ ital-3222 215 10 jacobs jacobs PROPN ital-3222 215 11 , , PUNCT ital-3222 215 12 eds ed NOUN ital-3222 215 13 . . PUNCT ital-3222 215 14 , , PUNCT ital-3222 215 15 w3c w3c PROPN ital-3222 215 16 html html PROPN ital-3222 215 17 4.01 4.01 NUM ital-3222 215 18 specification specification NOUN ital-3222 215 19 , , PUNCT ital-3222 215 20 1999 1999 NUM ital-3222 215 21 , , PUNCT ital-3222 215 22 http://www.w3.org/tr/rec http://www.w3.org/tr/rec X ital-3222 215 23 -html40/ -html40/ PROPN ital-3222 215 24 ( ( PUNCT ital-3222 215 25 accessed access VERB ital-3222 215 26 may may PROPN ital-3222 215 27 7 7 NUM ital-3222 215 28 , , PUNCT ital-3222 215 29 2009 2009 NUM ital-3222 215 30 ) ) PUNCT ital-3222 215 31 . . PUNCT ital-3222 216 1 22 22 NUM ital-3222 216 2 . . PUNCT ital-3222 217 1 ian ian PROPN ital-3222 217 2 h. h. PROPN ital-3222 217 3 witten witten PROPN ital-3222 217 4 , , PUNCT ital-3222 217 5 david david PROPN ital-3222 217 6 bainbridge bainbridge PROPN ital-3222 217 7 , , PUNCT ital-3222 217 8 and and CCONJ ital-3222 217 9 stefan stefan PROPN ital-3222 217 10 boddie boddie VERB ital-3222 217 11 , , PUNCT ital-3222 217 12 “ " PUNCT ital-3222 217 13 greenstone greenstone NOUN ital-3222 217 14 : : PUNCT ital-3222 217 15 open open ADJ ital-3222 217 16 source source NOUN ital-3222 217 17 dl dl PROPN ital-3222 217 18 software software NOUN ital-3222 217 19 , , PUNCT ital-3222 217 20 ” " PUNCT ital-3222 217 21 communications communication NOUN ital-3222 217 22 of of ADP ital-3222 217 23 the the DET ital-3222 217 24 acm acm NOUN ital-3222 217 25 44 44 NUM ital-3222 217 26 , , PUNCT ital-3222 217 27 no no NOUN ital-3222 217 28 . . 
NOUN ital-3222 217 29 5 5 NUM ital-3222 217 30 ( ( PUNCT ital-3222 217 31 2001 2001 NUM ital-3222 217 32 ): ): PUNCT ital-3222 217 33 47 47 NUM ital-3222 217 34 . . PUNCT ital-3222 217 35 23 23 NUM ital-3222 217 36 . . PUNCT ital-3222 217 37 project project PROPN ital-3222 217 38 gutenberg gutenberg PROPN ital-3222 217 39 , , PUNCT ital-3222 217 40 2008 2008 NUM ital-3222 217 41 , , PUNCT ital-3222 217 42 http://www.gutenberg.org/ http://www.gutenberg.org/ NOUN ital-3222 217 43 ( ( PUNCT ital-3222 217 44 accessed access VERB ital-3222 217 45 may may PROPN ital-3222 217 46 7 7 NUM ital-3222 217 47 , , PUNCT ital-3222 217 48 2009 2009 NUM ital-3222 217 49 ) ) PUNCT ital-3222 217 50 . . PUNCT ital-3222 218 1 24 24 NUM ital-3222 218 2 . . PUNCT ital-3222 219 1 przemysław przemysław PROPN ital-3222 219 2 skibiński skibiński PROPN ital-3222 219 3 and and CCONJ ital-3222 219 4 szymon szymon ADJ ital-3222 219 5 grabowski grabowski PROPN ital-3222 219 6 , , PUNCT ital-3222 219 7 “ " PUNCT ital-3222 219 8 variablelength variablelength NOUN ital-3222 219 9 contexts context NOUN ital-3222 219 10 for for ADP ital-3222 219 11 ppm ppm ADJ ital-3222 219 12 , , PUNCT ital-3222 219 13 ” " PUNCT ital-3222 219 14 proceedings proceeding NOUN ital-3222 219 15 of of ADP ital-3222 219 16 the the DET ital-3222 219 17 ieee ieee NOUN ital-3222 219 18 data data PROPN ital-3222 219 19 compression compression NOUN ital-3222 219 20 conference conference NOUN ital-3222 219 21 ( ( PUNCT ital-3222 219 22 snowbird snowbird PROPN ital-3222 219 23 , , PUNCT ital-3222 219 24 utah utah PROPN ital-3222 219 25 , , PUNCT ital-3222 219 26 2004 2004 NUM ital-3222 219 27 ): ): PUNCT ital-3222 219 28 409 409 NUM ital-3222 219 29 . . 
PUNCT