On the Usefulness of Backspace. Shmuel Tomi Klein (Bar Ilan University) and Dana Shapira (Ashkelon Academic College)
![Page 1: On Shmuel Tomi Klein Bar Ilan University Back space Dana Shapira Ashkelon Academic College the Uselfulness of.](https://reader038.fdocuments.us/reader038/viewer/2022103023/56649e765503460f94b77313/html5/thumbnails/1.jpg)
On the Usefulness of Backspace
Shmuel Tomi Klein, Bar Ilan University
Dana Shapira, Ashkelon Academic College
![Page 2: On Shmuel Tomi Klein Bar Ilan University Back space Dana Shapira Ashkelon Academic College the Uselfulness of.](https://reader038.fdocuments.us/reader038/viewer/2022103023/56649e765503460f94b77313/html5/thumbnails/2.jpg)
Extension of study of NEGATION in large IR systems
United -Nations
Edgar (-1:2) Po
Backspace
Not really a character, but can be useful
![Page 3: On Shmuel Tomi Klein Bar Ilan University Back space Dana Shapira Ashkelon Academic College the Uselfulness of.](https://reader038.fdocuments.us/reader038/viewer/2022103023/56649e765503460f94b77313/html5/thumbnails/3.jpg)
Three applications
Handling large numbers
Text compression in IR
Blockwise Huffman decoding
![Page 4: On Shmuel Tomi Klein Bar Ilan University Back space Dana Shapira Ashkelon Academic College the Uselfulness of.](https://reader038.fdocuments.us/reader038/viewer/2022103023/56649e765503460f94b77313/html5/thumbnails/4.jpg)
Handling large numbers
1
Syntax:
A (1:3) -B (1:5) C -D (1:1) E
In use at the Responsa Project
![Page 5: On Shmuel Tomi Klein Bar Ilan University Back space Dana Shapira Ashkelon Academic College the Uselfulness of.](https://reader038.fdocuments.us/reader038/viewer/2022103023/56649e765503460f94b77313/html5/thumbnails/5.jpg)
Handling large numbers
1
Too many large numbers
Break in blocks of k digits
1234567 → 1234 567
Problem with precision:
5678 also retrieves 123456789
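The precision problem can be reproduced in a few lines. This is a minimal sketch, assuming k = 4 and left-to-right splitting as in the slide's example; the name `split_blocks` is illustrative and not taken from the talk.

```python
def split_blocks(number: str, k: int = 4) -> list[str]:
    """Cut a digit string into consecutive blocks of at most k digits."""
    return [number[i:i + k] for i in range(0, len(number), k)]


# Each block is indexed as if it were an ordinary word.
blocks = split_blocks("123456789")

# A query for the number 5678 matches the middle block of 123456789,
# although the text never contains 5678 as a number of its own.
false_hit = "5678" in blocks
```

Without some marker that distinguishes a block from a complete number, the index cannot tell the two cases apart.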
![Page 6: On Shmuel Tomi Klein Bar Ilan University Back space Dana Shapira Ashkelon Academic College the Uselfulness of.](https://reader038.fdocuments.us/reader038/viewer/2022103023/56649e765503460f94b77313/html5/thumbnails/6.jpg)
Handling large numbers
1
Each word includes a trailing blank
House of Lords
I declared an income of 1000000 on my last 10 1040 forms
Long numbers use Backspace BS
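1234567890 → 1234 BS 5678 BS 90
I declared an income of 1000 BS 000 on my last 10 1040 forms

| I | declared | an | income | of | 1000 | BS | 000 | on | my | last | 10 | 1040 | forms |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |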
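The encoding on this slide can be sketched as a tokenizer. The assumptions are labeled here and in the comments: k = 4, whitespace-separated input, and the string "BS" standing in for the backspace token (a real index would need a token that cannot collide with a word). The function names are illustrative.

```python
BS = "BS"  # placeholder for the backspace token (assumption)


def tokenize(text: str, k: int = 4) -> list[str]:
    """Split text into words; break numbers longer than k digits into
    k-digit blocks joined by BS tokens."""
    tokens = []
    for word in text.split():
        if word.isdigit() and len(word) > k:
            blocks = [word[i:i + k] for i in range(0, len(word), k)]
            for i, block in enumerate(blocks):
                if i > 0:
                    tokens.append(BS)
                tokens.append(block)
        else:
            tokens.append(word)
    return tokens


def detokenize(tokens: list[str]) -> str:
    """Every word carries a trailing blank; BS deletes the blank before it."""
    text = ""
    for token in tokens:
        if token == BS:
            text = text[:-1]
        else:
            text += token + " "
    return text


sentence = "I declared an income of 1000000 on my last 10 1040 forms"
tokens = tokenize(sentence)
```

The token list has 14 entries, matching the positions on the slide, and `detokenize` restores the sentence followed by the final word's trailing blank.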
![Page 7: On Shmuel Tomi Klein Bar Ilan University Back space Dana Shapira Ashkelon Academic College the Uselfulness of.](https://reader038.fdocuments.us/reader038/viewer/2022103023/56649e765503460f94b77313/html5/thumbnails/7.jpg)
Handling large numbers
1
| To search for | submit query |
|---|---|
| 234 | -BS 234 |
| 2000 1040 | -BS 2000 1040 -BS |
| 12345678 | -BS 1234 BS 5678 -BS |
| 1234567 | -BS 1234 BS 567 |
| [email protected] | user @ BS addr . BS com |
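The numeric rows of the table appear to follow one rule, sketched here as an assumption read off those rows rather than a rule stated in the talk. The query starts with -BS (the first block must not continue a longer number), joins the blocks with BS, and ends with -BS only when the last block has the full k digits, since a shorter block can only be the last block of a number. The name `number_query` and k = 4 are illustrative.

```python
def number_query(number: str, k: int = 4) -> str:
    """Build a query string for one number under the inferred rule."""
    blocks = [number[i:i + k] for i in range(0, len(number), k)]
    parts = ["-BS", " BS ".join(blocks)]
    # Only a full-length block could be followed by a further block.
    if len(blocks[-1]) == k:
        parts.append("-BS")
    return " ".join(parts)


queries = {n: number_query(n) for n in ("234", "12345678", "1234567")}
```

The three results agree with the corresponding rows of the table; the phrase query and the address row are outside what this sketch covers.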
![Page 8: On Shmuel Tomi Klein Bar Ilan University Back space Dana Shapira Ashkelon Academic College the Uselfulness of.](https://reader038.fdocuments.us/reader038/viewer/2022103023/56649e765503460f94b77313/html5/thumbnails/8.jpg)
Text Compression in IR
2
Huffword: alternating words and non-words
Use single Huffman tree for:
— words including a trailing blank
— punctuation signs: BS ;
— Backspace, to handle exceptions
![Page 9: On Shmuel Tomi Klein Bar Ilan University Back space Dana Shapira Ashkelon Academic College the Uselfulness of.](https://reader038.fdocuments.us/reader038/viewer/2022103023/56649e765503460f94b77313/html5/thumbnails/9.jpg)
Text Compression in IR
2
| File | Size | Huffword | BSHuff | gzip | bzip |
|---|---|---|---|---|---|
| English | 3.1 MB | 3.91 | 3.97 | 3.28 | 4.41 |
| French | 7.1 MB | 3.98 | 4.03 | 3.27 | 4.63 |
![Page 10: On Shmuel Tomi Klein Bar Ilan University Back space Dana Shapira Ashkelon Academic College the Uselfulness of.](https://reader038.fdocuments.us/reader038/viewer/2022103023/56649e765503460f94b77313/html5/thumbnails/10.jpg)
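Blockwise Huffman decoding
3
Given an alphabet $a_1, \ldots, a_n$ with probabilities $p_1, \ldots, p_n$, find lengths $l_1, \ldots, l_n$ such that the average length $\sum_{i=1}^{n} p_i l_i$ is minimized.
HUFFMAN

| Symbol | A | B | C | D | E |
|---|---|---|---|---|---|
| Probability | 0.4 | 0.3 | 0.1 | 0.1 | 0.1 |
| Length | 1 | 2 | 3 | 4 | 4 |
| Codeword | 0 | 11 | 101 | 1000 | 1001 |

[Figure: Huffman tree with leaves A, B, C, D, E and edges labeled 0 and 1]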
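The codeword lengths on this slide can be checked with the standard Huffman construction. This is a minimal sketch using Python's `heapq`; the function name and the tie-breaking counter are illustrative choices, and ties among the three symbols of probability 0.1 may assign the lengths 3, 4, 4 to C, D, E in a different order than the slide does.

```python
import heapq
from itertools import count


def huffman_lengths(probs: dict[str, float]) -> dict[str, int]:
    """Return a codeword length per symbol by repeatedly merging the two
    least probable subtrees."""
    tie = count()  # breaks ties so the heap never compares symbol lists
    heap = [(p, next(tie), [s]) for s, p in probs.items()]
    heapq.heapify(heap)
    lengths = {s: 0 for s in probs}
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:
            lengths[s] += 1  # every symbol in the merged subtree moves one level down
        heapq.heappush(heap, (p1 + p2, next(tie), s1 + s2))
    return lengths


probs = {"A": 0.4, "B": 0.3, "C": 0.1, "D": 0.1, "E": 0.1}
lengths = huffman_lengths(probs)
average = sum(probs[s] * lengths[s] for s in probs)
```

For these probabilities the average length is 0.4·1 + 0.3·2 + 0.1·3 + 0.1·4 + 0.1·4 = 2.1 bits per symbol.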
![Page 11: On Shmuel Tomi Klein Bar Ilan University Back space Dana Shapira Ashkelon Academic College the Uselfulness of.](https://reader038.fdocuments.us/reader038/viewer/2022103023/56649e765503460f94b77313/html5/thumbnails/11.jpg)
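Decoding k bits together
Partial decoding tables

| Table | Entry | Pattern | Decoding |
|---|---|---|---|
| 0 | 1 | 001 | 0 0 1 → A A, Rem 1 |
| 1 | 6 | 1110 | 11 10 → B, Rem 10 |
| 3 | 3 | 100011 | 1000 11 → D B |

[Figure: Huffman tree with leaves A, B, C, D, E, edges labeled 0 and 1, and internal nodes numbered 0, 1, 2, 3]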
![Page 12: On Shmuel Tomi Klein Bar Ilan University Back space Dana Shapira Ashkelon Academic College the Uselfulness of.](https://reader038.fdocuments.us/reader038/viewer/2022103023/56649e765503460f94b77313/html5/thumbnails/12.jpg)
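Decoding k bits together
Partial decoding tables

[Figure: Huffman tree with internal nodes numbered 0, 1, 2, 3 and leaves A, B, C, D, E]

| Entry | Pattern for Table 0 | Table 0: W | Table 0: l | Table 1: W | Table 1: l | Table 2: W | Table 2: l | Table 3: W | Table 3: l |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 000 | AAA | 0 | D | 0 | DA | 0 | DAA | 0 |
| 1 | 001 | AA | 1 | E | 0 | D | 1 | DA | 1 |
| 2 | 010 | A | 2 | CA | 0 | EA | 0 | D | 2 |
| 3 | 011 | AB | 0 | C | 1 | E | 1 | DB | 0 |
| 4 | 100 | - | 3 | BAA | 0 | CAA | 0 | EAA | 0 |
| 5 | 101 | C | 0 | BA | 1 | CA | 1 | EA | 1 |
| 6 | 110 | BA | 0 | B | 2 | C | 2 | E | 2 |
| 7 | 111 | B | 1 | BB | 0 | CB | 0 | EB | 0 |

Prefix for Tables 0, 1, 2, 3: Λ, 1, 10, 100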
![Page 13: On Shmuel Tomi Klein Bar Ilan University Back space Dana Shapira Ashkelon Academic College the Uselfulness of.](https://reader038.fdocuments.us/reader038/viewer/2022103023/56649e765503460f94b77313/html5/thumbnails/13.jpg)
Decoding Algorithm
j ← 0
for f ← 1 to EOI, f ← f + k
    (output, j) ← T(j, M[f; f + k − 1])

Codewords: A 0, B 11, C 101, D 1000, E 1001
[Partial decoding tables 0 to 3 with 3-bit patterns, repeated from the previous slide]

| Block | 100 | 101 | 110 | 000 | 101 |
|---|---|---|---|---|---|
| Table j used | 0 | 3 | 1 | 2 | 0 |
| output | - | EA | B | DA | C |
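The tables and the decoding loop can be sketched as follows. The assumptions: tables are numbered by the proper prefixes of the codewords in order of length (Λ, 1, 10, 100 for this code), the code is complete, and the input length is a multiple of k. The function names are illustrative and not from the talk.

```python
CODE = {"A": "0", "B": "11", "C": "101", "D": "1000", "E": "1001"}


def build_tables(code: dict[str, str], k: int) -> list[list[tuple[str, int]]]:
    """tables[j][entry] = (decoded symbols W, next table l)."""
    words = {w: s for s, w in code.items()}
    # Proper prefixes of codewords are the internal nodes of the Huffman tree.
    prefixes = sorted({w[:i] for w in code.values() for i in range(len(w))},
                      key=lambda p: (len(p), p))
    number = {p: j for j, p in enumerate(prefixes)}
    tables = []
    for prefix in prefixes:
        rows = []
        for entry in range(2 ** k):
            out, cur = "", ""
            for bit in prefix + format(entry, f"0{k}b"):
                cur += bit
                if cur in words:          # reached a leaf: emit and restart
                    out, cur = out + words[cur], ""
            rows.append((out, number[cur]))  # the remainder selects the next table
        tables.append(rows)
    return tables


def decode(bits: str, tables: list[list[tuple[str, int]]], k: int) -> str:
    """Decode k bits per table access; assumes len(bits) is a multiple of k."""
    out, j = [], 0
    for f in range(0, len(bits), k):
        w, j = tables[j][int(bits[f:f + k], 2)]
        out.append(w)
    return "".join(out)


tables = build_tables(CODE, 3)
text = decode("100101110000101", tables, 3)
```

On the slide's input the loop visits tables 0, 3, 1, 2, 0 and emits -, EA, B, DA, C, so `text` is the string EABDAC.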
![Page 14: On Shmuel Tomi Klein Bar Ilan University Back space Dana Shapira Ashkelon Academic College the Uselfulness of.](https://reader038.fdocuments.us/reader038/viewer/2022103023/56649e765503460f94b77313/html5/thumbnails/14.jpg)
Looking for new tradeoffs
Reduced partial decoding tables including backspaces
[Figure: Huffman tree with internal nodes numbered 0, 1, 2, 3 and leaves A, B, C, D, E; nodes 0 and 3 are marked]
![Page 15: On Shmuel Tomi Klein Bar Ilan University Back space Dana Shapira Ashkelon Academic College the Uselfulness of.](https://reader038.fdocuments.us/reader038/viewer/2022103023/56649e765503460f94b77313/html5/thumbnails/15.jpg)
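Reduced tables
Codewords: A 0, B 11, C 101, D 1000, E 1001

| Entry | Pattern for Table 0 | Table 0: W | Table 0: l | Table 0: b | Table 3: W | Table 3: l | Table 3: b |
|---|---|---|---|---|---|---|---|
| 0 | 000 | AAA | 0 | 0 | DAA | 0 | 0 |
| 1 | 001 | AA | 0 | 1 | DA | 0 | 1 |
| 2 | 010 | A | 0 | 2 | D | 0 | 2 |
| 3 | 011 | AB | 0 | 0 | DB | 0 | 0 |
| 4 | 100 | - | 3 | 0 | EAA | 0 | 0 |
| 5 | 101 | C | 0 | 0 | EA | 0 | 1 |
| 6 | 110 | BA | 0 | 0 | E | 0 | 2 |
| 7 | 111 | B | 0 | 1 | EB | 0 | 0 |

Revised Decoding Algorithm
j ← 0
back ← 0
for f ← 1 to EOI
    (output, j, back) ← T(j, M[f; f + k − 1])
    f ← f + k − back

Input: 1 0 0 1 0 1 1 1 0 0 0 0 1 0 1

| Bits read | 100 | 101 | 111 | 100 | 001 | 101 |
|---|---|---|---|---|---|---|
| Table j used | 0 | 3 | 0 | 0 | 3 | 0 |
| output | - | EA | B | - | DA | C |
| back | 0 | 1 | 1 | 0 | 1 | 0 |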
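The reduced tables can be sketched in the same style. The assumptions are labeled: the set of tables to keep is passed in as `keep` (0 and 3 here, as on the slide; how to choose that set is not covered by this sketch), a remainder whose table was dropped is handled by backing up over its bits and restarting in Table 0, and a final block of fewer than k bits is decoded bit by bit. The names are illustrative. The set `keep` must contain 0 and must guarantee that every step advances by at least one bit.

```python
CODE = {"A": "0", "B": "11", "C": "101", "D": "1000", "E": "1001"}


def build_reduced_tables(code, k, keep):
    """tables[j][entry] = (W, l, b) for the kept tables j only."""
    words = {w: s for s, w in code.items()}
    prefixes = sorted({w[:i] for w in code.values() for i in range(len(w))},
                      key=lambda p: (len(p), p))
    number = {p: j for j, p in enumerate(prefixes)}
    tables = {}
    for j in keep:
        rows = []
        for entry in range(2 ** k):
            out, cur = "", ""
            for bit in prefixes[j] + format(entry, f"0{k}b"):
                cur += bit
                if cur in words:
                    out, cur = out + words[cur], ""
            if number[cur] in keep:
                rows.append((out, number[cur], 0))   # continue in a kept table
            else:
                rows.append((out, 0, len(cur)))      # back up and re-read those bits
        tables[j] = rows
    return tables, prefixes, words


def decode(bits, k=3, keep=(0, 3)):
    tables, prefixes, words = build_reduced_tables(CODE, k, keep)
    out, j, f = [], 0, 0
    while f + k <= len(bits):
        w, j, back = tables[j][int(bits[f:f + k], 2)]
        out.append(w)
        f += k - back
    # Tail shorter than k bits (assumption: decode it one bit at a time).
    cur = prefixes[j]
    for bit in bits[f:]:
        cur += bit
        if cur in words:
            out.append(words[cur])
            cur = ""
    return "".join(out)


reduced, _, _ = build_reduced_tables(CODE, 3, (0, 3))
text = decode("100101110000101")
```

Two tables are stored instead of four, at the price of six table accesses instead of five on this input.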
![Page 16: On Shmuel Tomi Klein Bar Ilan University Back space Dana Shapira Ashkelon Academic College the Uselfulness of.](https://reader038.fdocuments.us/reader038/viewer/2022103023/56649e765503460f94b77313/html5/thumbnails/16.jpg)
1 0 0 1 0 1 1 1 0 0 0 0 1 0 1
Regular Huffman: E A B D A C
Partial decoding tables: - EA B DA C
Reduced tables with backspace: - EA B - DA C
![Page 17: On Shmuel Tomi Klein Bar Ilan University Back space Dana Shapira Ashkelon Academic College the Uselfulness of.](https://reader038.fdocuments.us/reader038/viewer/2022103023/56649e765503460f94b77313/html5/thumbnails/17.jpg)
Experimental results (WSJ)

| | Bit | partial decode tables | reduced tables |
|---|---|---|---|
| k | 1 | 8 | 8 |
| bpa | 1 | 8 | 6.4 |
| MB/sec | 6.6 | - | 7.6 |
| RAM | 2.1 | 197 | 34.1 |
![Page 18: On Shmuel Tomi Klein Bar Ilan University Back space Dana Shapira Ashkelon Academic College the Uselfulness of.](https://reader038.fdocuments.us/reader038/viewer/2022103023/56649e765503460f94b77313/html5/thumbnails/18.jpg)
Experimental results (KJV)

| | Bit | partial decode tables | reduced tables |
|---|---|---|---|
| k | 1 | 8 | 8 |
| bpa | 1 | 8 | 6.4 |
| MB/sec | 10.1 | 0.4 | 13.7 |
| RAM | 0.21 | 17 | 8.7 |
![Page 19: On Shmuel Tomi Klein Bar Ilan University Back space Dana Shapira Ashkelon Academic College the Uselfulness of.](https://reader038.fdocuments.us/reader038/viewer/2022103023/56649e765503460f94b77313/html5/thumbnails/19.jpg)
Conclusion
3 examples of IR applications
Use of conceptual elements,
like backspaces, may improve
algorithms.
![Page 20: On Shmuel Tomi Klein Bar Ilan University Back space Dana Shapira Ashkelon Academic College the Uselfulness of.](https://reader038.fdocuments.us/reader038/viewer/2022103023/56649e765503460f94b77313/html5/thumbnails/20.jpg)
Thank you!
Questions?