Mroonga - Fast fulltext search for all languages on MySQL
groonga storage engine
Brazil, Inc. Tasuku SUENAGA a.k.a. gunyarakun
What I talk about
MySQL fulltext index
• Phrase search is slow.
• Updating the index is slow.
• A fulltext index cannot be combined with other indexes (such as B-Tree).
Our earlier product, Tritonn, solves these problems.
Tritonn
• Tritonn = groonga + patches for MyISAM
MySQL v.s. Tritonn

| | MySQL (5.0) fulltext index | Tritonn |
|---|---|---|
| Index size | 109 MB | 1028 MB |
| Phrase search for 'united states' | 44.91 sec | 0.40 sec |
| Indexing after inserting records | 1,474 sec | 1,808 sec |
| Inserting records after index creation | 28,182 sec | 1,839 sec |
| WHERE MATCH AGAINST and ORDER BY primary key | 20.33 sec | 0.89 sec |
| WHERE MATCH AGAINST and primary key > 200000 | 6.55 sec | 0.32 sec |

Target dataset: English Wikipedia, 458,713 records, 1088 MB
So Tritonn provides …
• Fast phrase search
• Fast index update (realtime)
• Works well with other indexes
But some problems remain.
Remaining problems
• MyISAM based
  ‒ Table lock: while a table is being updated, read accesses are blocked.
• Patch based
  ‒ Maintaining the patches and building a patched MySQL is too messy.
Need for a new solution.
The new solution is
• groonga storage engine
  ‒ Uses the column store of groonga instead of MyISAM.
  ‒ Not a patch but a storage engine.
Tritonn (old) groonga storage engine(new)
Advantages
• Table lock free
  ‒ The column store of groonga is lock-free.
• Accesses only the required columns
  ‒ Not row-based.
• Easy to build and develop
And some optimizations for typical queries
Optimization (1)
• COUNT(*) optimization
  ‒ For queries like the one below.
SELECT COUNT(*) FROM table WHERE MATCH(col) AGAINST ('query');
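One rough way to picture why this query shape can be answered cheaply: if a fulltext index already maps each term to the set of matching record IDs, the count is just the size of that set, and no row has to be read. The sketch below is a toy inverted index in Python. The names `build_index` and `count_matches` are hypothetical, and nothing here is taken from the groonga storage engine's actual implementation.

```python
from collections import defaultdict


def build_index(rows):
    """Map each lowercase word to the set of row IDs that contain it."""
    index = defaultdict(set)
    for row_id, text in rows.items():
        for word in text.lower().split():
            index[word].add(row_id)
    return index


def count_matches(index, word):
    """Answer COUNT(*) from the index alone, without touching any row."""
    return len(index.get(word.lower(), ()))


# Illustrative data only.
rows = {
    1: "united states of america",
    2: "united kingdom",
    3: "states and provinces",
}
index = build_index(rows)
print(count_matches(index, "united"))
```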
Optimization (2)
• ORDER BY score and LIMIT optimization
  ‒ For queries like the one below.
SELECT * FROM table WHERE MATCH(col) AGAINST ('query')
ORDER BY MATCH(col) AGAINST ('query')
LIMIT 10;
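When only the first few rows by score are needed, fully sorting every match is unnecessary; a bounded selection is enough. The following is a minimal Python sketch of that general idea, assuming higher scores rank first. The `(row_id, score)` pairs and the `top_k_by_score` helper are hypothetical, and the engine itself may work differently.

```python
import heapq


def top_k_by_score(matches, k):
    """Return the k highest-scoring (row_id, score) pairs, best first.

    heapq.nlargest selects the k best entries without requiring the
    caller to sort the whole match list first.
    """
    return heapq.nlargest(k, matches, key=lambda pair: pair[1])


# Illustrative (row_id, score) pairs.
matches = [(1, 0.2), (2, 0.9), (3, 0.5), (4, 0.7), (5, 0.1)]
print(top_k_by_score(matches, 3))
```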
Conclusion of my part
• groonga storage engine provides
  ‒ Fast phrase search
  ‒ Fast index update (realtime)
  ‒ Inserting records doesn't block reading records
The combination of Groonga and Spider
Kentoku SHIBA kentokushiba at gmail dot com
The combination of Groonga and Spider
In this time ...
What is Spider Storage Engine
Spider Storage Engine is a storage engine for transparent database sharding.
[Figure: the application (AP) talks only to SPIDER (1. Request, 2. Just connect to Spider, 3. Response); behind SPIDER are DB1 (tbl_a), DB2 (tbl_b), and DB3 (tbl_c).]
The combination of Groonga and Spider
Combining Groonga and Spider gives you the following optimizations:
- Fulltext search with sorting by score.
- Sorting by the range partition key column.
- Fulltext search with filtering by the partition key column.
The optimization for the fulltext searching with sorting by score
(The case of scanning all partitions)
Sorting by score
- Parallel searching
- 2-step limitation
[Figure: four databases DB1 to DB4, each holding table t1. The query below is shown at step 1 and again at step 2, where step 2 is marked three times.]
SELECT * FROM t1 WHERE MATCH(c2) AGAINST('hoge') ORDER BY _score LIMIT 100;
Parallel searching is coming soon.
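One plausible reading of the 2-step limitation is that each data node applies the LIMIT locally, and the node that received the request applies it once more after merging the partial results. Below is a minimal Python sketch under that assumption. The shard contents, the `shard_top_n` and `merge_top_n` helpers, and the higher-score-first ordering are all illustrative and not taken from Spider's source.

```python
import heapq
from itertools import islice


def shard_top_n(rows, n):
    """First limit: each data node sorts its own matches and keeps n."""
    return sorted(rows, key=lambda r: r[1], reverse=True)[:n]


def merge_top_n(shard_results, n):
    """Second limit: merge the already sorted partial lists and keep n."""
    merged = heapq.merge(*shard_results, key=lambda r: r[1], reverse=True)
    return list(islice(merged, n))


# Illustrative (row_id, score) pairs held by three data nodes.
shards = [
    [("a", 0.9), ("b", 0.1), ("c", 0.4)],
    [("d", 0.8), ("e", 0.7)],
    [("f", 0.3), ("g", 0.95)],
]
partials = [shard_top_n(rows, 2) for rows in shards]
print(merge_top_n(partials, 2))
```

No node ever returns more than the requested number of rows, so the merging node handles at most LIMIT times the number of nodes.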
The optimization for the sorting by range partition key column
(coming soon)
The sorting by range partition key column
- Sort optimization with range partition
[Figure: four databases DB1 to DB4, each holding table t1; the steps are numbered 1, then 2, 3, 4; the partitions are labeled c1 < 50, c1 < 100, and c1 >= 100. The query below is shown twice, the second time with the note about the LIMIT value.]
SELECT * FROM t1 WHERE MATCH(c2) AGAINST('hoge') ORDER BY c1 LIMIT 100;
(The LIMIT value decreases gradually.)
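The note that the LIMIT value decreases gradually suggests that the partitions are visited in range order and each later partition is asked only for the rows still missing. Because range partitions are already ordered relative to each other on c1, concatenating the per-partition results gives the global order. A minimal Python sketch of that reading follows; the partition contents and the `ordered_limit_scan` helper are hypothetical.

```python
def ordered_limit_scan(partitions, limit):
    """Visit range partitions in key order, shrinking the limit as rows arrive.

    Every partition is assumed to hold only c1 values smaller than those
    of the next one, so locally sorted results can simply be concatenated.
    Returns the collected rows and the limit sent to each visited partition.
    """
    result, limits_sent = [], []
    remaining = limit
    for rows in partitions:
        if remaining <= 0:
            break  # enough rows collected; later partitions are skipped
        limits_sent.append(remaining)
        result.extend(sorted(rows)[:remaining])
        remaining = limit - len(result)
    return result, limits_sent


# c1 values of the rows that match the fulltext condition (illustrative).
partitions = [
    [10, 45, 30],      # c1 < 50
    [99, 60, 75, 50],  # 50 <= c1 < 100
    [150, 100, 120],   # c1 >= 100
]
print(ordered_limit_scan(partitions, 5))
```

In this run the first partition is asked for 5 rows, the second for the 2 still missing, and the third is never queried.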
The optimization for the fulltext searching with filtering by partition key column
The filtering by partition key column
- Partition pruning
[Figure: four databases DB1 to DB4, each holding table t1; the steps are numbered 1 and 2; the partitions are labeled c1 < 50, c1 < 100, and c1 >= 100. The query below is shown at both steps.]
SELECT * FROM t1 WHERE MATCH(c2) AGAINST('hoge') AND c1 = 60 ORDER BY _score LIMIT 100;
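Partition pruning here means that a condition on the partition key lets the query go to a single partition instead of all of them. With the ranges shown on this slide, c1 = 60 falls into the c1 < 100 partition. Below is a minimal Python sketch, assuming range partitions described by exclusive upper bounds in the manner of MySQL's VALUES LESS THAN; the `pick_partition` helper and the partition labels are made up for illustration.

```python
import bisect

# Exclusive upper bounds of the first two partitions; the last partition
# takes everything from 100 upward (c1 >= 100).
UPPER_BOUNDS = [50, 100]
PARTITION_NAMES = ["c1 < 50", "c1 < 100", "c1 >= 100"]


def pick_partition(c1):
    """Return the only partition that can contain rows with this c1 value."""
    return PARTITION_NAMES[bisect.bisect_right(UPPER_BOUNDS, c1)]


print(pick_partition(60))
```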
End of the session
Source code and binary
If you want to try the Spider features introduced here, you can download them from the following locations.
source code: http://groonga.org/pkg/mysql-5.5.8-spider-2.24h-vp-0.13-hs-1.0.src.tgz
binary (Linux x86_64 glibc2.3): http://groonga.org/pkg/mysql-5.5.8-spider-2.24h-vp-0.13-hs-1.0.bin.tgz
initialization SQL: http://groonga.org/pkg/spider-init-2.24-for-5.5.8.tgz
Contact us
If you have any questions, comments, or suggestions, please contact us here:
http://bit.ly/fSs5vx
Kentoku SHIBA (kentokushiba at gmail dot com)
Thank you for taking your time!!
Daijiro MORI (morita at razil dot jp)
Tasuku SUENAGA (a at razil dot jp)