SlideShare a Scribd company logo
Fast Wavelet Tree Construction
in Practice
2018/10/10
Yusaku Kaneta
Rakuten Institute of Technology
Rakuten, Inc.
2
Background
§ Wavelet tree [Grossi+,SODA’03] / Wavele matrix [Claude+,SPIRE’12]
• Fundamentaltoolsinmodernstringprocessing.
• Fastaccess,rank,andselectonanarrayofnintegersin[0,σ).
10 1 1 0 1 1 000 10 10 01
11 0 01 1 0100 11 00 0 1
1 0 100 01 1 00 0 11 1 10
00 101 10 1 10 1 01 00 1
Awavelettreeson[6,8,9,4,14,11,1,0,5,7,12,13,15,2,3,10]
3
Fast WT construction has attracted much attension!!
Papers
Sequential?
Parallel?
Impl.?
[Fuentes-Sepúlveda+,SEA’14] P Yes
[Shun+,DCC’15] P Yes
[Labeit+,DCC’16] P Yes
[Fischer+,ALENEX’18] S+P Yes
[Munro+,SPIRE’14][Babenko+,SODA’15](Bestupperbound) S No
[Shun+,DCC’17](Bestupperbound) P No
§ Gap between theory and practice:
• Nopracticalimplementationof[Munro+,SPIRE’14][Babenko+,
SODA’15]:Thecurrentfastestwavelettreeconstructionalgorithm.
4
First* practical implementation of wavelet tree construction
based on [Munro+, SPIRE’14][Babenko+, SODA’15]
Main result
§ Our idea: Replace precomputed tables w/
• SpecialCPUinstruction:PEXTin(BMI2)orPSHUFB(inSSSE3).
• Broadwordcomputation(omitted).
§ Experiments on real datasets:
• Wavelettree:ourswerecompetitivetoSOTA[Fischer+,ALENEX'18]
• Waveletmatrix:ourswere1.1–1.9xfasterthanSOTA.
*SzymonGrabowskikindlypointedthatTuukkaNorrialsotriedsimilarapproaches:
github.com/tsnorri/wt-construct-gn
5
Our Techniques
for Practical Implementation
6
Basic construction of wavelet trees
§ Recursivelysplita(sub)arrayofelements
accordingtotheiri-thtargetbits.
• Processoneelementatatime.
• Appenditstargetbittoabitvector.
• Appendittotheleft(resp.,right)subarray
ifitstargetbitis0(resp.,1).
Assumewordlengthw = 32,inputintegerwidtht =4,andthusw/t = 8 forexplanation.
6
0110
8
1000
9
1001
4
0100
14
1110
11
1011
1
0001
0
0000
5
0101
7
0111
12
1100
15
1111
13
1101
2
0010
3
0011
10
1010
7
Basic construction of wavelet trees
§ Recursivelysplita(sub)arrayofelements
accordingtotheiri-thtargetbits.
• Processoneelementatatime.
• Appenditstargetbittoabitvector.
• Appendittotheleft(resp.,right)subarray
ifitstargetbitis0(resp.,1).
Assumewordlengthw = 32,inputintegerwidtht =4,andthusw/t = 8 forexplanation.
6
0110
8
1000
9
1001
4
0100
14
1110
11
1011
1
0001
0
0000
5
0101
7
0111
12
1100
15
1111
13
1101
2
0010
3
0011
10
1010
8
6
0110
8
1000
9
1001
4
0100
14
1110
11
1011
1
0001
0
0000
5
0101
7
0111
12
1100
15
1111
13
1101
2
0010
3
0011
10
1010
6
0110
8
1000
9
1001
4
0100
14
1110
11
1011
1
0001
0
0000
Fast construction of wavelet trees
§ Fastwavelettreeconstruction
[Munro+, SPIRE’14][Babenko+,SODA’15]
• Processmultipleelementsatatime.
• w/toft-bitelementscanbereadtogether.
§ Primitiveoperations:
X Bbitpack ( ,i)=
X X0listsplit( ,i)=( , )X1
X1
A subarray consisting of
X’s elements whose 0th bit 1
A subarray consisting of
X’s elements whose 0th bit 0
X0
First w/t elements in a word of w bits (e.g., w = 32 and t = 4)
X
Packed 0th bits of elements contained in X.
B
Assumewordlengthw = 32,inputintegerwidtht =4,andthusw/t = 8 forexplanation.
Assumption:
• ThestandardwordRAM
• w:wordlength(inbits)
• t:inputintegerwidth(inbits)
• t≤w justforexplanation.
(Thisconditioncanbeeliminated.)
9
Main idea: Two special CPU instructions
ParallelbitsEXTract
PEXT(X,Y)=Z
PacksbitsinXaccordingtoY
suchthatforalli,
bit(Z,i)=bit(X,j)holds.
• bit(a,i):i-thbitofa.
• select1(a,i):indexofy’si-th1.
• j=select1(Y,i).
ParallelSHUFfleBytes
PSHUFB(X,Y)=Z
Permutest-bitblocksinXaccordingtoY
suchthatforalli,
block(Z,i)=block(X,j)holds.
• block(a,i):i-tht-bitblockofa.
• j=block(Y,i).
• Inpractice,w=64andt=8arerequired.
10
PEXT-based technique for bitpack
§ Preprocessing:
• L: Packedarraywithblock(L,i) = 1foreveryiin[0,w/t)
(i.e.,eacht-bitblockhas1onlyatitslowestbit).
Assumewordsizew = 32,elementsizet =4,andthusw/t = 8 forexplanation.
01100100000100000101011100100011
00011001000001000001010111001000
>>t-i-1
00011001000001000001010111001000
00010001000100010001000100010001L
Z
Y
Y
X
11001100000000000000000000000000
1. Shift X by t-i-1=2 2. Perform PEXT(Y,L)=Z
PEXT(Y,L)
bitpack(X= ,i=1):01100100000100000101011100100011
6 4 1 0 5 7 2 3
11
PEXT-based technique for listsplit
§ Preprocessing:
• H: Packedarraywithblock(H,i) = 2t-1foreveryiin[0,w/t)
(i.e.,eacht-bitblockhas1onlyatitshighestbit).
2. Perform PEXT(X,M1)=Z1 w/
M1=~M0
6 4 01 2 3
0 0 046 05 7
f f 0ff 00 0
75
M1
Z1
X
1. Perform PEXT(X,M0)=Z0 w/
M0=(H-((X>>(t-i-1))&L)^H)^H
0 0 f00 ff f
6 4 701 5 2 3
0 0 001 02 3
M0
Z0
X
listsplit(X= ,i=1):
PEXT(X, M0) PEXT(Z, M1)
6 4 701 5 2 3
01100100000100000101011100100011
Assumewordsizew = 32,elementsizet =4,andthusw/t = 8 forexplanation.
01100100000100000101011100100011 01100100000100000101011100100011
12
PSHUFB-based technique for listsplit
§ Preprocessing: Let m=2w/t be # of blocks in a word.
• T[a]forallain[0,m):Packedarraycontaininginascendingorder
(1)allindexesof0’sfollowedby(2)allindexesof1’sin a.
2. Extract each part from Y.
|a|1: Number of ones in a.
<<t|a|1
6 4 701 52 3
0 0 001 02 3Z0
Y 6 4 701 52 3
0 0 046 05 7Z1
Y
00010000001000110110010001010111 00010000001000110110010001010111
1. Perform PSHUFB(X,T[a]) w/
a=bitpack(X,i)=11001100
0 1 532 46 7
01100100000100000101011100100011
6 4 701 5 2 3
6 4 701 52 3
T[a]
Y
X
Assumewordsizew = 32,elementsizet =4,andthusw/t = 8 forexplanation.
>>t|a|1
<<w-t|a|1PSHUFB(X,T[a])
6 4 701 5 2 3
01100100000100000101011100100011
listsplit(X= ,i=1):
13
Experiments
14
Experimental setup and data
§ Setup:
• Corei7-4790(3.6GHz)w/16GBmainmemoryrunningUbuntu18.04.
§ Data:
• 6and10datasetsfromPizzaandChilliandLightweightCorpus,resp.
§ Methods:
• PSHUFB:ourmethodbasedonPSHUFBinstructioninSSSE3.
• PEXT:ourmethodbasedonPEXTinstructioninBMI2.
• NAÏVE,PS,PC:previousones(availableatgithub.com/kurpicz/pwd)
bitpackwas implemented by PEXT in allour methods.
15
Result: wavelet tree
§ PSHUFB vs.NAÏVE:
• 1.9x(average)
§ PEXT vs.NAÏVE:
• 1.9x(average)
§ Oursv.s.PC/PS:
• Competitive
1st winner
2nd winner
Medianof5elapsedtimes(insec)forconstructing
awavelettreewithoutrankandselectindexes.
n σ NAÏVE PC PS PSHUFB PEXT
dblp.xml 2.96·108 97 5.57 2.99 3.03 3.09 3.04
dna 4.04·108 16 4.43 2.42 2.72 2.07 2.05
english 2.21·109 238 53.0 27.2 28.8 23.7 23.5
pitches 5.58·107 132 1.25 0.685 0.812 0.576 0.570
proteins 1.18·109 27 16.1 8.29 8.67 9.27 9.12
sources 2.11·108 229 4.94 2.54 2.61 2.37 2.55
chr22.dna 3.46·107 5 0.233 0.143 0.188 0.157 0.156
etext99 1.05·108 145 2.32 1.31 1.62 1.16 1.14
gcc-3.0.tar 8.66·107 149 1.91 1.32 1.12 0.949 0.935
howto 3.94·107 196 0.832 0.478 0.496 0.438 0.432
jdk13c 6.97·107 113 1.30 0.708 0.789 0.829 0.755
linux-
2.4.5.tar
1.16·108 255 2.76 1.41 1.45 1.30 1.51
rctail96 1.15·108 93 2.25 1.18 1.20 1.19 1.17
rfc 1.16·108 120 2.28 1.25 1.27 1.20 1.19
sprot34.dat 1.10·108 66 2.12 1.14 1.36 1.13 1.13
w3c2 1.04·108 255 2.30 1.30 1.28 1.30 1.18
OurPSHUFB andPEXT outperformedNAÏVE whilebeingcompatiblewithPC.
16
Result: wavelet matrix
§ PSHUFB vs.NAÏVE:
• 3.0–4.6x
§ PEXT vs.NAÏVE:
• 2.5–4.5x
§ Oursv.s.PC:
• 1.1–1.9x
1st winner
2nd winner
n σ NAÏVE PC PS PSHUFB PEXT
dblp.xml 2.96·108 97 6.20 3.00 3.01 2.02 2.05
dna 4.04·108 16 5.88 2.43 2.70 1.29 1.30
english 2.21·109 238 57.0 27.2 28.5 15.8 15.9
pitches 5.58·107 132 1.37 0.684 0.709 0.429 0.547
proteins 1.18·109 27 22.4 8.29 8.41 6.36 6.43
sources 2.11·108 229 5.81 2.54 2.95 1.57 1.59
chr22.dna 3.46·107 5 0.385 0.143 0.164 0.130 0.092
etext99 1.05·108 145 3.01 1.27 1.34 0.803 0.771
gcc-3.0.tar 8.66·107 149 2.18 1.05 1.29 0.633 0.639
howto 3.94·107 196 1.011 0.478 0.494 0.295 0.298
jdk13c 6.97·107 113 1.57 0.708 0.705 0.500 0.506
linux-
2.4.5.tar
1.16·108 255 3.14 1.41 1.64 0.872 1.11
rctail96 1.15·108 93 2.37 1.22 1.19 0.779 0.792
rfc 1.16·108 120 2.65 1.30 1.28 0.791 0.811
sprot34.dat 1.10·108 66 2.81 1.14 1.38 0.905 0.855
w3c2 1.04·108 255 2.63 1.54 1.27 0.80 0.81
Medianof5elapsedtimes(insec)forconstructing
awaveletmatrixwithoutrankandselectindexes.
Our PSHUFB and PEXT outperformed all others: NAÏVE, PS, and PC.
17
Conclusion
§ Practical wavelet tree construction using PEXT/PSHUFB.
• Based on [Munro+, SPIRE’14] and [Babenko+, SODA’15]
§ Experiments on real datasets:
• Wavelet tree: Faster than NAÏVE and competitive w/ SOTA:
prefix sorting (PS) and prefix counting (PC).
• Wavelet matrix: Faster than NAÏVE, PS, and PC.
§ Future work
• Exploit moreparallelismin CPUcoresand/orSIMD registers.
Fast Wavelet Tree Construction in Practice
Ad

More Related Content

What's hot (20)

Cache-Oblivious データ構造入門 @DSIRNLP#5
Cache-Oblivious データ構造入門 @DSIRNLP#5Cache-Oblivious データ構造入門 @DSIRNLP#5
Cache-Oblivious データ構造入門 @DSIRNLP#5
Takuya Akiba
 
pg_stat_activityの不可解な観測結果の謎 (第47回 PostgreSQLアンカンファレンス@オンライン 発表資料)
pg_stat_activityの不可解な観測結果の謎 (第47回 PostgreSQLアンカンファレンス@オンライン 発表資料)pg_stat_activityの不可解な観測結果の謎 (第47回 PostgreSQLアンカンファレンス@オンライン 発表資料)
pg_stat_activityの不可解な観測結果の謎 (第47回 PostgreSQLアンカンファレンス@オンライン 発表資料)
NTT DATA Technology & Innovation
 
20111015 勉強会 (PCIe / SR-IOV)
20111015 勉強会 (PCIe / SR-IOV)20111015 勉強会 (PCIe / SR-IOV)
20111015 勉強会 (PCIe / SR-IOV)
Kentaro Ebisawa
 
第3章 変分近似法 LDAにおける変分ベイズ法・周辺化変分ベイズ法
第3章 変分近似法 LDAにおける変分ベイズ法・周辺化変分ベイズ法第3章 変分近似法 LDAにおける変分ベイズ法・周辺化変分ベイズ法
第3章 変分近似法 LDAにおける変分ベイズ法・周辺化変分ベイズ法
ksmzn
 
プログラミングコンテストでのデータ構造
プログラミングコンテストでのデータ構造プログラミングコンテストでのデータ構造
プログラミングコンテストでのデータ構造
Takuya Akiba
 
CUDAのアセンブリ言語基礎のまとめ PTXとSASSの概説
CUDAのアセンブリ言語基礎のまとめ PTXとSASSの概説CUDAのアセンブリ言語基礎のまとめ PTXとSASSの概説
CUDAのアセンブリ言語基礎のまとめ PTXとSASSの概説
Takateru Yamagishi
 
SDL2の紹介
SDL2の紹介SDL2の紹介
SDL2の紹介
nyaocat
 
ChordアルゴリズムによるDHT入門
ChordアルゴリズムによるDHT入門ChordアルゴリズムによるDHT入門
ChordアルゴリズムによるDHT入門
Hiroya Nagao
 
Apache Sparkの基本と最新バージョン3.2のアップデート(Open Source Conference 2021 Online/Fukuoka ...
Apache Sparkの基本と最新バージョン3.2のアップデート(Open Source Conference 2021 Online/Fukuoka ...Apache Sparkの基本と最新バージョン3.2のアップデート(Open Source Conference 2021 Online/Fukuoka ...
Apache Sparkの基本と最新バージョン3.2のアップデート(Open Source Conference 2021 Online/Fukuoka ...
NTT DATA Technology & Innovation
 
わかりやすいパターン認識_2章
わかりやすいパターン認識_2章わかりやすいパターン認識_2章
わかりやすいパターン認識_2章
weda654
 
Linux女子部 systemd徹底入門
Linux女子部 systemd徹底入門Linux女子部 systemd徹底入門
Linux女子部 systemd徹底入門
Etsuji Nakai
 
Dockerと外部ルータを連携させる仕組みを作ってみた
Dockerと外部ルータを連携させる仕組みを作ってみたDockerと外部ルータを連携させる仕組みを作ってみた
Dockerと外部ルータを連携させる仕組みを作ってみた
npsg
 
C++の話(本当にあった怖い話)
C++の話(本当にあった怖い話)C++の話(本当にあった怖い話)
C++の話(本当にあった怖い話)
Yuki Tamura
 
インフラエンジニアのためのRancherを使ったDocker運用入門
インフラエンジニアのためのRancherを使ったDocker運用入門インフラエンジニアのためのRancherを使ったDocker運用入門
インフラエンジニアのためのRancherを使ったDocker運用入門
Masahito Zembutsu
 
Apache Hadoopの未来 3系になって何が変わるのか?
Apache Hadoopの未来 3系になって何が変わるのか?Apache Hadoopの未来 3系になって何が変わるのか?
Apache Hadoopの未来 3系になって何が変わるのか?
NTT DATA OSS Professional Services
 
最小カットを使って「燃やす埋める問題」を解く
最小カットを使って「燃やす埋める問題」を解く最小カットを使って「燃やす埋める問題」を解く
最小カットを使って「燃やす埋める問題」を解く
shindannin
 
Apache Arrow - データ処理ツールの次世代プラットフォーム
Apache Arrow - データ処理ツールの次世代プラットフォームApache Arrow - データ処理ツールの次世代プラットフォーム
Apache Arrow - データ処理ツールの次世代プラットフォーム
Kouhei Sutou
 
画像認識のための深層学習
画像認識のための深層学習画像認識のための深層学習
画像認識のための深層学習
Saya Katafuchi
 
GPUとSSDがPostgreSQLを加速する~クエリ処理スループット10GB/sへの挑戦~ [DB Tech Showcase Tokyo/2017]
GPUとSSDがPostgreSQLを加速する~クエリ処理スループット10GB/sへの挑戦~ [DB Tech Showcase Tokyo/2017]GPUとSSDがPostgreSQLを加速する~クエリ処理スループット10GB/sへの挑戦~ [DB Tech Showcase Tokyo/2017]
GPUとSSDがPostgreSQLを加速する~クエリ処理スループット10GB/sへの挑戦~ [DB Tech Showcase Tokyo/2017]
Kohei KaiGai
 
Cache-Oblivious データ構造入門 @DSIRNLP#5
Cache-Oblivious データ構造入門 @DSIRNLP#5Cache-Oblivious データ構造入門 @DSIRNLP#5
Cache-Oblivious データ構造入門 @DSIRNLP#5
Takuya Akiba
 
pg_stat_activityの不可解な観測結果の謎 (第47回 PostgreSQLアンカンファレンス@オンライン 発表資料)
pg_stat_activityの不可解な観測結果の謎 (第47回 PostgreSQLアンカンファレンス@オンライン 発表資料)pg_stat_activityの不可解な観測結果の謎 (第47回 PostgreSQLアンカンファレンス@オンライン 発表資料)
pg_stat_activityの不可解な観測結果の謎 (第47回 PostgreSQLアンカンファレンス@オンライン 発表資料)
NTT DATA Technology & Innovation
 
20111015 勉強会 (PCIe / SR-IOV)
20111015 勉強会 (PCIe / SR-IOV)20111015 勉強会 (PCIe / SR-IOV)
20111015 勉強会 (PCIe / SR-IOV)
Kentaro Ebisawa
 
第3章 変分近似法 LDAにおける変分ベイズ法・周辺化変分ベイズ法
第3章 変分近似法 LDAにおける変分ベイズ法・周辺化変分ベイズ法第3章 変分近似法 LDAにおける変分ベイズ法・周辺化変分ベイズ法
第3章 変分近似法 LDAにおける変分ベイズ法・周辺化変分ベイズ法
ksmzn
 
プログラミングコンテストでのデータ構造
プログラミングコンテストでのデータ構造プログラミングコンテストでのデータ構造
プログラミングコンテストでのデータ構造
Takuya Akiba
 
CUDAのアセンブリ言語基礎のまとめ PTXとSASSの概説
CUDAのアセンブリ言語基礎のまとめ PTXとSASSの概説CUDAのアセンブリ言語基礎のまとめ PTXとSASSの概説
CUDAのアセンブリ言語基礎のまとめ PTXとSASSの概説
Takateru Yamagishi
 
SDL2の紹介
SDL2の紹介SDL2の紹介
SDL2の紹介
nyaocat
 
ChordアルゴリズムによるDHT入門
ChordアルゴリズムによるDHT入門ChordアルゴリズムによるDHT入門
ChordアルゴリズムによるDHT入門
Hiroya Nagao
 
Apache Sparkの基本と最新バージョン3.2のアップデート(Open Source Conference 2021 Online/Fukuoka ...
Apache Sparkの基本と最新バージョン3.2のアップデート(Open Source Conference 2021 Online/Fukuoka ...Apache Sparkの基本と最新バージョン3.2のアップデート(Open Source Conference 2021 Online/Fukuoka ...
Apache Sparkの基本と最新バージョン3.2のアップデート(Open Source Conference 2021 Online/Fukuoka ...
NTT DATA Technology & Innovation
 
わかりやすいパターン認識_2章
わかりやすいパターン認識_2章わかりやすいパターン認識_2章
わかりやすいパターン認識_2章
weda654
 
Linux女子部 systemd徹底入門
Linux女子部 systemd徹底入門Linux女子部 systemd徹底入門
Linux女子部 systemd徹底入門
Etsuji Nakai
 
Dockerと外部ルータを連携させる仕組みを作ってみた
Dockerと外部ルータを連携させる仕組みを作ってみたDockerと外部ルータを連携させる仕組みを作ってみた
Dockerと外部ルータを連携させる仕組みを作ってみた
npsg
 
C++の話(本当にあった怖い話)
C++の話(本当にあった怖い話)C++の話(本当にあった怖い話)
C++の話(本当にあった怖い話)
Yuki Tamura
 
インフラエンジニアのためのRancherを使ったDocker運用入門
インフラエンジニアのためのRancherを使ったDocker運用入門インフラエンジニアのためのRancherを使ったDocker運用入門
インフラエンジニアのためのRancherを使ったDocker運用入門
Masahito Zembutsu
 
最小カットを使って「燃やす埋める問題」を解く
最小カットを使って「燃やす埋める問題」を解く最小カットを使って「燃やす埋める問題」を解く
最小カットを使って「燃やす埋める問題」を解く
shindannin
 
Apache Arrow - データ処理ツールの次世代プラットフォーム
Apache Arrow - データ処理ツールの次世代プラットフォームApache Arrow - データ処理ツールの次世代プラットフォーム
Apache Arrow - データ処理ツールの次世代プラットフォーム
Kouhei Sutou
 
画像認識のための深層学習
画像認識のための深層学習画像認識のための深層学習
画像認識のための深層学習
Saya Katafuchi
 
GPUとSSDがPostgreSQLを加速する~クエリ処理スループット10GB/sへの挑戦~ [DB Tech Showcase Tokyo/2017]
GPUとSSDがPostgreSQLを加速する~クエリ処理スループット10GB/sへの挑戦~ [DB Tech Showcase Tokyo/2017]GPUとSSDがPostgreSQLを加速する~クエリ処理スループット10GB/sへの挑戦~ [DB Tech Showcase Tokyo/2017]
GPUとSSDがPostgreSQLを加速する~クエリ処理スループット10GB/sへの挑戦~ [DB Tech Showcase Tokyo/2017]
Kohei KaiGai
 

Similar to Fast Wavelet Tree Construction in Practice (20)

Families of Triangular Norm Based Kernel Function and Its Application to Kern...
Families of Triangular Norm Based Kernel Function and Its Application to Kern...Families of Triangular Norm Based Kernel Function and Its Application to Kern...
Families of Triangular Norm Based Kernel Function and Its Application to Kern...
Okamoto Laboratory, The University of Electro-Communications
 
Computing the Nucleon Spin from Lattice QCD
Computing the Nucleon Spin from Lattice QCDComputing the Nucleon Spin from Lattice QCD
Computing the Nucleon Spin from Lattice QCD
Christos Kallidonis
 
Demonstrating Quantum Speed-Up with a Two-Transmon Quantum Processor Ph.D. d...
Demonstrating Quantum Speed-Up  with a Two-Transmon Quantum Processor Ph.D. d...Demonstrating Quantum Speed-Up  with a Two-Transmon Quantum Processor Ph.D. d...
Demonstrating Quantum Speed-Up with a Two-Transmon Quantum Processor Ph.D. d...
Andreas Dewes
 
pres06-main
pres06-mainpres06-main
pres06-main
Brian Donhauser
 
quantum chemistry on quantum computer handson by Q# (2019/8/4@MDR Hongo, Tokyo)
quantum chemistry on quantum computer handson by Q# (2019/8/4@MDR Hongo, Tokyo)quantum chemistry on quantum computer handson by Q# (2019/8/4@MDR Hongo, Tokyo)
quantum chemistry on quantum computer handson by Q# (2019/8/4@MDR Hongo, Tokyo)
Maho Nakata
 
材料科学とスーパーコンピュータ: 基礎編
材料科学とスーパーコンピュータ: 基礎編材料科学とスーパーコンピュータ: 基礎編
材料科学とスーパーコンピュータ: 基礎編
Michio Katouda
 
2018 MUMS Fall Course - Mathematical surrogate and reduced-order models - Ral...
2018 MUMS Fall Course - Mathematical surrogate and reduced-order models - Ral...2018 MUMS Fall Course - Mathematical surrogate and reduced-order models - Ral...
2018 MUMS Fall Course - Mathematical surrogate and reduced-order models - Ral...
The Statistical and Applied Mathematical Sciences Institute
 
Financial time series analysis with R@the 3rd NIDA BADS conference by Asst. p...
Financial time series analysis with R@the 3rd NIDA BADS conference by Asst. p...Financial time series analysis with R@the 3rd NIDA BADS conference by Asst. p...
Financial time series analysis with R@the 3rd NIDA BADS conference by Asst. p...
BAINIDA
 
NNPDF3.1
NNPDF3.1NNPDF3.1
NNPDF3.1
juanrojochacon
 
Top500 november 2017
Top500 november 2017Top500 november 2017
Top500 november 2017
top500
 
C++ and Assembly: Debugging and Reverse Engineering
C++ and Assembly: Debugging and Reverse EngineeringC++ and Assembly: Debugging and Reverse Engineering
C++ and Assembly: Debugging and Reverse Engineering
corehard_by
 
SIAM SEAS Talk Slides
SIAM SEAS Talk SlidesSIAM SEAS Talk Slides
SIAM SEAS Talk Slides
Ryan White
 
Advances in the Solution of NS Eqs. in GPGPU Hardware. Second order scheme an...
Advances in the Solution of NS Eqs. in GPGPU Hardware. Second order scheme an...Advances in the Solution of NS Eqs. in GPGPU Hardware. Second order scheme an...
Advances in the Solution of NS Eqs. in GPGPU Hardware. Second order scheme an...
Storti Mario
 
Possible applications of low-rank tensors in statistics and UQ (my talk in Bo...
Possible applications of low-rank tensors in statistics and UQ (my talk in Bo...Possible applications of low-rank tensors in statistics and UQ (my talk in Bo...
Possible applications of low-rank tensors in statistics and UQ (my talk in Bo...
Alexander Litvinenko
 
20120214 optical pulse_measurement_wei-yi
20120214 optical pulse_measurement_wei-yi20120214 optical pulse_measurement_wei-yi
20120214 optical pulse_measurement_wei-yi
奕勳 陳
 
NTU ML TENSORFLOW
NTU ML TENSORFLOWNTU ML TENSORFLOW
NTU ML TENSORFLOW
Mark Chang
 
Introduction to Chainer Chemistry
Introduction to Chainer ChemistryIntroduction to Chainer Chemistry
Introduction to Chainer Chemistry
Preferred Networks
 
1st and 2nd Semester M Tech: Computer Science and Engineering (Dec-2015; Jan-...
1st and 2nd Semester M Tech: Computer Science and Engineering (Dec-2015; Jan-...1st and 2nd Semester M Tech: Computer Science and Engineering (Dec-2015; Jan-...
1st and 2nd Semester M Tech: Computer Science and Engineering (Dec-2015; Jan-...
BGS Institute of Technology, Adichunchanagiri University (ACU)
 
Developing fast low-rank tensor methods for solving PDEs with uncertain coef...
Developing fast  low-rank tensor methods for solving PDEs with uncertain coef...Developing fast  low-rank tensor methods for solving PDEs with uncertain coef...
Developing fast low-rank tensor methods for solving PDEs with uncertain coef...
Alexander Litvinenko
 
Amber_Tutorial_PHAST.pdf dtkikydFHLfljfljfl
Amber_Tutorial_PHAST.pdf dtkikydFHLfljfljflAmber_Tutorial_PHAST.pdf dtkikydFHLfljfljfl
Amber_Tutorial_PHAST.pdf dtkikydFHLfljfljfl
nordine19630
 
Computing the Nucleon Spin from Lattice QCD
Computing the Nucleon Spin from Lattice QCDComputing the Nucleon Spin from Lattice QCD
Computing the Nucleon Spin from Lattice QCD
Christos Kallidonis
 
Demonstrating Quantum Speed-Up with a Two-Transmon Quantum Processor Ph.D. d...
Demonstrating Quantum Speed-Up  with a Two-Transmon Quantum Processor Ph.D. d...Demonstrating Quantum Speed-Up  with a Two-Transmon Quantum Processor Ph.D. d...
Demonstrating Quantum Speed-Up with a Two-Transmon Quantum Processor Ph.D. d...
Andreas Dewes
 
quantum chemistry on quantum computer handson by Q# (2019/8/4@MDR Hongo, Tokyo)
quantum chemistry on quantum computer handson by Q# (2019/8/4@MDR Hongo, Tokyo)quantum chemistry on quantum computer handson by Q# (2019/8/4@MDR Hongo, Tokyo)
quantum chemistry on quantum computer handson by Q# (2019/8/4@MDR Hongo, Tokyo)
Maho Nakata
 
材料科学とスーパーコンピュータ: 基礎編
材料科学とスーパーコンピュータ: 基礎編材料科学とスーパーコンピュータ: 基礎編
材料科学とスーパーコンピュータ: 基礎編
Michio Katouda
 
Financial time series analysis with R@the 3rd NIDA BADS conference by Asst. p...
Financial time series analysis with R@the 3rd NIDA BADS conference by Asst. p...Financial time series analysis with R@the 3rd NIDA BADS conference by Asst. p...
Financial time series analysis with R@the 3rd NIDA BADS conference by Asst. p...
BAINIDA
 
Top500 november 2017
Top500 november 2017Top500 november 2017
Top500 november 2017
top500
 
C++ and Assembly: Debugging and Reverse Engineering
C++ and Assembly: Debugging and Reverse EngineeringC++ and Assembly: Debugging and Reverse Engineering
C++ and Assembly: Debugging and Reverse Engineering
corehard_by
 
SIAM SEAS Talk Slides
SIAM SEAS Talk SlidesSIAM SEAS Talk Slides
SIAM SEAS Talk Slides
Ryan White
 
Advances in the Solution of NS Eqs. in GPGPU Hardware. Second order scheme an...
Advances in the Solution of NS Eqs. in GPGPU Hardware. Second order scheme an...Advances in the Solution of NS Eqs. in GPGPU Hardware. Second order scheme an...
Advances in the Solution of NS Eqs. in GPGPU Hardware. Second order scheme an...
Storti Mario
 
Possible applications of low-rank tensors in statistics and UQ (my talk in Bo...
Possible applications of low-rank tensors in statistics and UQ (my talk in Bo...Possible applications of low-rank tensors in statistics and UQ (my talk in Bo...
Possible applications of low-rank tensors in statistics and UQ (my talk in Bo...
Alexander Litvinenko
 
20120214 optical pulse_measurement_wei-yi
20120214 optical pulse_measurement_wei-yi20120214 optical pulse_measurement_wei-yi
20120214 optical pulse_measurement_wei-yi
奕勳 陳
 
NTU ML TENSORFLOW
NTU ML TENSORFLOWNTU ML TENSORFLOW
NTU ML TENSORFLOW
Mark Chang
 
Introduction to Chainer Chemistry
Introduction to Chainer ChemistryIntroduction to Chainer Chemistry
Introduction to Chainer Chemistry
Preferred Networks
 
Developing fast low-rank tensor methods for solving PDEs with uncertain coef...
Developing fast  low-rank tensor methods for solving PDEs with uncertain coef...Developing fast  low-rank tensor methods for solving PDEs with uncertain coef...
Developing fast low-rank tensor methods for solving PDEs with uncertain coef...
Alexander Litvinenko
 
Amber_Tutorial_PHAST.pdf dtkikydFHLfljfljfl
Amber_Tutorial_PHAST.pdf dtkikydFHLfljfljflAmber_Tutorial_PHAST.pdf dtkikydFHLfljfljfl
Amber_Tutorial_PHAST.pdf dtkikydFHLfljfljfl
nordine19630
 
Ad

More from Rakuten Group, Inc. (20)

EPSS (Exploit Prediction Scoring System)モニタリングツールの開発
EPSS (Exploit Prediction Scoring System)モニタリングツールの開発EPSS (Exploit Prediction Scoring System)モニタリングツールの開発
EPSS (Exploit Prediction Scoring System)モニタリングツールの開発
Rakuten Group, Inc.
 
コードレビュー改善のためにJenkinsとIntelliJ IDEAのプラグインを自作してみた話
コードレビュー改善のためにJenkinsとIntelliJ IDEAのプラグインを自作してみた話コードレビュー改善のためにJenkinsとIntelliJ IDEAのプラグインを自作してみた話
コードレビュー改善のためにJenkinsとIntelliJ IDEAのプラグインを自作してみた話
Rakuten Group, Inc.
 
楽天における安全な秘匿情報管理への道のり
楽天における安全な秘匿情報管理への道のり楽天における安全な秘匿情報管理への道のり
楽天における安全な秘匿情報管理への道のり
Rakuten Group, Inc.
 
What Makes Software Green?
What Makes Software Green?What Makes Software Green?
What Makes Software Green?
Rakuten Group, Inc.
 
Simple and Effective Knowledge-Driven Query Expansion for QA-Based Product At...
Simple and Effective Knowledge-Driven Query Expansion for QA-Based Product At...Simple and Effective Knowledge-Driven Query Expansion for QA-Based Product At...
Simple and Effective Knowledge-Driven Query Expansion for QA-Based Product At...
Rakuten Group, Inc.
 
DataSkillCultureを浸透させる楽天の取り組み
DataSkillCultureを浸透させる楽天の取り組みDataSkillCultureを浸透させる楽天の取り組み
DataSkillCultureを浸透させる楽天の取り組み
Rakuten Group, Inc.
 
大規模なリアルタイム監視の導入と展開
大規模なリアルタイム監視の導入と展開大規模なリアルタイム監視の導入と展開
大規模なリアルタイム監視の導入と展開
Rakuten Group, Inc.
 
楽天における大規模データベースの運用
楽天における大規模データベースの運用楽天における大規模データベースの運用
楽天における大規模データベースの運用
Rakuten Group, Inc.
 
楽天サービスを支えるネットワークインフラストラクチャー
楽天サービスを支えるネットワークインフラストラクチャー楽天サービスを支えるネットワークインフラストラクチャー
楽天サービスを支えるネットワークインフラストラクチャー
Rakuten Group, Inc.
 
楽天の規模とクラウドプラットフォーム統括部の役割
楽天の規模とクラウドプラットフォーム統括部の役割楽天の規模とクラウドプラットフォーム統括部の役割
楽天の規模とクラウドプラットフォーム統括部の役割
Rakuten Group, Inc.
 
Rakuten Services and Infrastructure Team.pdf
Rakuten Services and Infrastructure Team.pdfRakuten Services and Infrastructure Team.pdf
Rakuten Services and Infrastructure Team.pdf
Rakuten Group, Inc.
 
The Data Platform Administration Handling the 100 PB.pdf
The Data Platform Administration Handling the 100 PB.pdfThe Data Platform Administration Handling the 100 PB.pdf
The Data Platform Administration Handling the 100 PB.pdf
Rakuten Group, Inc.
 
Supporting Internal Customers as Technical Account Managers.pdf
Supporting Internal Customers as Technical Account Managers.pdfSupporting Internal Customers as Technical Account Managers.pdf
Supporting Internal Customers as Technical Account Managers.pdf
Rakuten Group, Inc.
 
Making Cloud Native CI_CD Services.pdf
Making Cloud Native CI_CD Services.pdfMaking Cloud Native CI_CD Services.pdf
Making Cloud Native CI_CD Services.pdf
Rakuten Group, Inc.
 
How We Defined Our Own Cloud.pdf
How We Defined Our Own Cloud.pdfHow We Defined Our Own Cloud.pdf
How We Defined Our Own Cloud.pdf
Rakuten Group, Inc.
 
Travel & Leisure Platform Department's tech info
Travel & Leisure Platform Department's tech infoTravel & Leisure Platform Department's tech info
Travel & Leisure Platform Department's tech info
Rakuten Group, Inc.
 
Travel & Leisure Platform Department's tech info
Travel & Leisure Platform Department's tech infoTravel & Leisure Platform Department's tech info
Travel & Leisure Platform Department's tech info
Rakuten Group, Inc.
 
OWASPTop10_Introduction
OWASPTop10_IntroductionOWASPTop10_Introduction
OWASPTop10_Introduction
Rakuten Group, Inc.
 
Introduction of GORA API Group technology
Introduction of GORA API Group technologyIntroduction of GORA API Group technology
Introduction of GORA API Group technology
Rakuten Group, Inc.
 
100PBを越えるデータプラットフォームの実情
100PBを越えるデータプラットフォームの実情100PBを越えるデータプラットフォームの実情
100PBを越えるデータプラットフォームの実情
Rakuten Group, Inc.
 
EPSS (Exploit Prediction Scoring System)モニタリングツールの開発
EPSS (Exploit Prediction Scoring System)モニタリングツールの開発EPSS (Exploit Prediction Scoring System)モニタリングツールの開発
EPSS (Exploit Prediction Scoring System)モニタリングツールの開発
Rakuten Group, Inc.
 
コードレビュー改善のためにJenkinsとIntelliJ IDEAのプラグインを自作してみた話
コードレビュー改善のためにJenkinsとIntelliJ IDEAのプラグインを自作してみた話コードレビュー改善のためにJenkinsとIntelliJ IDEAのプラグインを自作してみた話
コードレビュー改善のためにJenkinsとIntelliJ IDEAのプラグインを自作してみた話
Rakuten Group, Inc.
 
楽天における安全な秘匿情報管理への道のり
楽天における安全な秘匿情報管理への道のり楽天における安全な秘匿情報管理への道のり
楽天における安全な秘匿情報管理への道のり
Rakuten Group, Inc.
 
Simple and Effective Knowledge-Driven Query Expansion for QA-Based Product At...
Simple and Effective Knowledge-Driven Query Expansion for QA-Based Product At...Simple and Effective Knowledge-Driven Query Expansion for QA-Based Product At...
Simple and Effective Knowledge-Driven Query Expansion for QA-Based Product At...
Rakuten Group, Inc.
 
DataSkillCultureを浸透させる楽天の取り組み
DataSkillCultureを浸透させる楽天の取り組みDataSkillCultureを浸透させる楽天の取り組み
DataSkillCultureを浸透させる楽天の取り組み
Rakuten Group, Inc.
 
大規模なリアルタイム監視の導入と展開
大規模なリアルタイム監視の導入と展開大規模なリアルタイム監視の導入と展開
大規模なリアルタイム監視の導入と展開
Rakuten Group, Inc.
 
楽天における大規模データベースの運用
楽天における大規模データベースの運用楽天における大規模データベースの運用
楽天における大規模データベースの運用
Rakuten Group, Inc.
 
楽天サービスを支えるネットワークインフラストラクチャー
楽天サービスを支えるネットワークインフラストラクチャー楽天サービスを支えるネットワークインフラストラクチャー
楽天サービスを支えるネットワークインフラストラクチャー
Rakuten Group, Inc.
 
楽天の規模とクラウドプラットフォーム統括部の役割
楽天の規模とクラウドプラットフォーム統括部の役割楽天の規模とクラウドプラットフォーム統括部の役割
楽天の規模とクラウドプラットフォーム統括部の役割
Rakuten Group, Inc.
 
Rakuten Services and Infrastructure Team.pdf
Rakuten Services and Infrastructure Team.pdfRakuten Services and Infrastructure Team.pdf
Rakuten Services and Infrastructure Team.pdf
Rakuten Group, Inc.
 
The Data Platform Administration Handling the 100 PB.pdf
The Data Platform Administration Handling the 100 PB.pdfThe Data Platform Administration Handling the 100 PB.pdf
The Data Platform Administration Handling the 100 PB.pdf
Rakuten Group, Inc.
 
Supporting Internal Customers as Technical Account Managers.pdf
Supporting Internal Customers as Technical Account Managers.pdfSupporting Internal Customers as Technical Account Managers.pdf
Supporting Internal Customers as Technical Account Managers.pdf
Rakuten Group, Inc.
 
Making Cloud Native CI_CD Services.pdf
Making Cloud Native CI_CD Services.pdfMaking Cloud Native CI_CD Services.pdf
Making Cloud Native CI_CD Services.pdf
Rakuten Group, Inc.
 
How We Defined Our Own Cloud.pdf
How We Defined Our Own Cloud.pdfHow We Defined Our Own Cloud.pdf
How We Defined Our Own Cloud.pdf
Rakuten Group, Inc.
 
Travel & Leisure Platform Department's tech info
Travel & Leisure Platform Department's tech infoTravel & Leisure Platform Department's tech info
Travel & Leisure Platform Department's tech info
Rakuten Group, Inc.
 
Travel & Leisure Platform Department's tech info
Travel & Leisure Platform Department's tech infoTravel & Leisure Platform Department's tech info
Travel & Leisure Platform Department's tech info
Rakuten Group, Inc.
 
Introduction of GORA API Group technology
Introduction of GORA API Group technologyIntroduction of GORA API Group technology
Introduction of GORA API Group technology
Rakuten Group, Inc.
 
100PBを越えるデータプラットフォームの実情
100PBを越えるデータプラットフォームの実情100PBを越えるデータプラットフォームの実情
100PBを越えるデータプラットフォームの実情
Rakuten Group, Inc.
 
Ad

Recently uploaded (20)

Q1 2025 Dropbox Earnings and Investor Presentation
Q1 2025 Dropbox Earnings and Investor PresentationQ1 2025 Dropbox Earnings and Investor Presentation
Q1 2025 Dropbox Earnings and Investor Presentation
Dropbox
 
Kit-Works Team Study_아직도 Dockefile.pdf_김성호
Kit-Works Team Study_아직도 Dockefile.pdf_김성호Kit-Works Team Study_아직도 Dockefile.pdf_김성호
Kit-Works Team Study_아직도 Dockefile.pdf_김성호
Wonjun Hwang
 
Reimagine How You and Your Team Work with Microsoft 365 Copilot.pptx
Reimagine How You and Your Team Work with Microsoft 365 Copilot.pptxReimagine How You and Your Team Work with Microsoft 365 Copilot.pptx
Reimagine How You and Your Team Work with Microsoft 365 Copilot.pptx
John Moore
 
Dark Dynamism: drones, dark factories and deurbanization
Dark Dynamism: drones, dark factories and deurbanizationDark Dynamism: drones, dark factories and deurbanization
Dark Dynamism: drones, dark factories and deurbanization
Jakub Šimek
 
RTP Over QUIC: An Interesting Opportunity Or Wasted Time?
RTP Over QUIC: An Interesting Opportunity Or Wasted Time?RTP Over QUIC: An Interesting Opportunity Or Wasted Time?
RTP Over QUIC: An Interesting Opportunity Or Wasted Time?
Lorenzo Miniero
 
AI 3-in-1: Agents, RAG, and Local Models - Brent Laster
AI 3-in-1: Agents, RAG, and Local Models - Brent LasterAI 3-in-1: Agents, RAG, and Local Models - Brent Laster
AI 3-in-1: Agents, RAG, and Local Models - Brent Laster
All Things Open
 
Top 5 Benefits of Using Molybdenum Rods in Industrial Applications.pptx
Top 5 Benefits of Using Molybdenum Rods in Industrial Applications.pptxTop 5 Benefits of Using Molybdenum Rods in Industrial Applications.pptx
Top 5 Benefits of Using Molybdenum Rods in Industrial Applications.pptx
mkubeusa
 
On-Device or Remote? On the Energy Efficiency of Fetching LLM-Generated Conte...
On-Device or Remote? On the Energy Efficiency of Fetching LLM-Generated Conte...On-Device or Remote? On the Energy Efficiency of Fetching LLM-Generated Conte...
On-Device or Remote? On the Energy Efficiency of Fetching LLM-Generated Conte...
Ivano Malavolta
 
Shoehorning dependency injection into a FP language, what does it take?
Shoehorning dependency injection into a FP language, what does it take?Shoehorning dependency injection into a FP language, what does it take?
Shoehorning dependency injection into a FP language, what does it take?
Eric Torreborre
 
How to Install & Activate ListGrabber - eGrabber
How to Install & Activate ListGrabber - eGrabberHow to Install & Activate ListGrabber - eGrabber
How to Install & Activate ListGrabber - eGrabber
eGrabber
 
DevOpsDays SLC - Platform Engineers are Product Managers.pptx
DevOpsDays SLC - Platform Engineers are Product Managers.pptxDevOpsDays SLC - Platform Engineers are Product Managers.pptx
DevOpsDays SLC - Platform Engineers are Product Managers.pptx
Justin Reock
 
Zilliz Cloud Monthly Technical Review: May 2025
Zilliz Cloud Monthly Technical Review: May 2025Zilliz Cloud Monthly Technical Review: May 2025
Zilliz Cloud Monthly Technical Review: May 2025
Zilliz
 
Could Virtual Threads cast away the usage of Kotlin Coroutines - DevoxxUK2025
Could Virtual Threads cast away the usage of Kotlin Coroutines - DevoxxUK2025Could Virtual Threads cast away the usage of Kotlin Coroutines - DevoxxUK2025
Could Virtual Threads cast away the usage of Kotlin Coroutines - DevoxxUK2025
João Esperancinha
 
Design pattern talk by Kaya Weers - 2025 (v2)
Design pattern talk by Kaya Weers - 2025 (v2)Design pattern talk by Kaya Weers - 2025 (v2)
Design pattern talk by Kaya Weers - 2025 (v2)
Kaya Weers
 
Cybersecurity Threat Vectors and Mitigation
Cybersecurity Threat Vectors and MitigationCybersecurity Threat Vectors and Mitigation
Cybersecurity Threat Vectors and Mitigation
VICTOR MAESTRE RAMIREZ
 
Agentic Automation - Delhi UiPath Community Meetup
Agentic Automation - Delhi UiPath Community MeetupAgentic Automation - Delhi UiPath Community Meetup
Agentic Automation - Delhi UiPath Community Meetup
Manoj Batra (1600 + Connections)
 
AI Agents at Work: UiPath, Maestro & the Future of Documents
AI Agents at Work: UiPath, Maestro & the Future of DocumentsAI Agents at Work: UiPath, Maestro & the Future of Documents
AI Agents at Work: UiPath, Maestro & the Future of Documents
UiPathCommunity
 
machines-for-woodworking-shops-en-compressed.pdf
machines-for-woodworking-shops-en-compressed.pdfmachines-for-woodworking-shops-en-compressed.pdf
machines-for-woodworking-shops-en-compressed.pdf
AmirStern2
 
Building the Customer Identity Community, Together.pdf
Building the Customer Identity Community, Together.pdfBuilding the Customer Identity Community, Together.pdf
Building the Customer Identity Community, Together.pdf
Cheryl Hung
 
Enterprise Integration Is Dead! Long Live AI-Driven Integration with Apache C...
Enterprise Integration Is Dead! Long Live AI-Driven Integration with Apache C...Enterprise Integration Is Dead! Long Live AI-Driven Integration with Apache C...
Enterprise Integration Is Dead! Long Live AI-Driven Integration with Apache C...
Markus Eisele
 
Q1 2025 Dropbox Earnings and Investor Presentation
Q1 2025 Dropbox Earnings and Investor PresentationQ1 2025 Dropbox Earnings and Investor Presentation
Q1 2025 Dropbox Earnings and Investor Presentation
Dropbox
 
Kit-Works Team Study_아직도 Dockefile.pdf_김성호
Kit-Works Team Study_아직도 Dockefile.pdf_김성호Kit-Works Team Study_아직도 Dockefile.pdf_김성호
Kit-Works Team Study_아직도 Dockefile.pdf_김성호
Wonjun Hwang
 
Reimagine How You and Your Team Work with Microsoft 365 Copilot.pptx
Reimagine How You and Your Team Work with Microsoft 365 Copilot.pptxReimagine How You and Your Team Work with Microsoft 365 Copilot.pptx
Reimagine How You and Your Team Work with Microsoft 365 Copilot.pptx
John Moore
 
Dark Dynamism: drones, dark factories and deurbanization
Dark Dynamism: drones, dark factories and deurbanizationDark Dynamism: drones, dark factories and deurbanization
Dark Dynamism: drones, dark factories and deurbanization
Jakub Šimek
 
RTP Over QUIC: An Interesting Opportunity Or Wasted Time?
RTP Over QUIC: An Interesting Opportunity Or Wasted Time?RTP Over QUIC: An Interesting Opportunity Or Wasted Time?
RTP Over QUIC: An Interesting Opportunity Or Wasted Time?
Lorenzo Miniero
 
AI 3-in-1: Agents, RAG, and Local Models - Brent Laster
AI 3-in-1: Agents, RAG, and Local Models - Brent LasterAI 3-in-1: Agents, RAG, and Local Models - Brent Laster
AI 3-in-1: Agents, RAG, and Local Models - Brent Laster
All Things Open
 
Top 5 Benefits of Using Molybdenum Rods in Industrial Applications.pptx
Top 5 Benefits of Using Molybdenum Rods in Industrial Applications.pptxTop 5 Benefits of Using Molybdenum Rods in Industrial Applications.pptx
Top 5 Benefits of Using Molybdenum Rods in Industrial Applications.pptx
mkubeusa
 
On-Device or Remote? On the Energy Efficiency of Fetching LLM-Generated Conte...
On-Device or Remote? On the Energy Efficiency of Fetching LLM-Generated Conte...On-Device or Remote? On the Energy Efficiency of Fetching LLM-Generated Conte...
On-Device or Remote? On the Energy Efficiency of Fetching LLM-Generated Conte...
Ivano Malavolta
 
Shoehorning dependency injection into a FP language, what does it take?
Shoehorning dependency injection into a FP language, what does it take?Shoehorning dependency injection into a FP language, what does it take?
Shoehorning dependency injection into a FP language, what does it take?
Eric Torreborre
 
How to Install & Activate ListGrabber - eGrabber
How to Install & Activate ListGrabber - eGrabberHow to Install & Activate ListGrabber - eGrabber
How to Install & Activate ListGrabber - eGrabber
eGrabber
 
DevOpsDays SLC - Platform Engineers are Product Managers.pptx
DevOpsDays SLC - Platform Engineers are Product Managers.pptxDevOpsDays SLC - Platform Engineers are Product Managers.pptx
DevOpsDays SLC - Platform Engineers are Product Managers.pptx
Justin Reock
 
Zilliz Cloud Monthly Technical Review: May 2025
Zilliz Cloud Monthly Technical Review: May 2025Zilliz Cloud Monthly Technical Review: May 2025
Zilliz Cloud Monthly Technical Review: May 2025
Zilliz
 
Could Virtual Threads cast away the usage of Kotlin Coroutines - DevoxxUK2025
Could Virtual Threads cast away the usage of Kotlin Coroutines - DevoxxUK2025Could Virtual Threads cast away the usage of Kotlin Coroutines - DevoxxUK2025
Could Virtual Threads cast away the usage of Kotlin Coroutines - DevoxxUK2025
João Esperancinha
 
Design pattern talk by Kaya Weers - 2025 (v2)
Design pattern talk by Kaya Weers - 2025 (v2)Design pattern talk by Kaya Weers - 2025 (v2)
Design pattern talk by Kaya Weers - 2025 (v2)
Kaya Weers
 
Cybersecurity Threat Vectors and Mitigation
Cybersecurity Threat Vectors and MitigationCybersecurity Threat Vectors and Mitigation
Cybersecurity Threat Vectors and Mitigation
VICTOR MAESTRE RAMIREZ
 
AI Agents at Work: UiPath, Maestro & the Future of Documents
AI Agents at Work: UiPath, Maestro & the Future of DocumentsAI Agents at Work: UiPath, Maestro & the Future of Documents
AI Agents at Work: UiPath, Maestro & the Future of Documents
UiPathCommunity
 
machines-for-woodworking-shops-en-compressed.pdf
machines-for-woodworking-shops-en-compressed.pdfmachines-for-woodworking-shops-en-compressed.pdf
machines-for-woodworking-shops-en-compressed.pdf
AmirStern2
 
Building the Customer Identity Community, Together.pdf
Building the Customer Identity Community, Together.pdfBuilding the Customer Identity Community, Together.pdf
Building the Customer Identity Community, Together.pdf
Cheryl Hung
 
Enterprise Integration Is Dead! Long Live AI-Driven Integration with Apache C...
Enterprise Integration Is Dead! Long Live AI-Driven Integration with Apache C...Enterprise Integration Is Dead! Long Live AI-Driven Integration with Apache C...
Enterprise Integration Is Dead! Long Live AI-Driven Integration with Apache C...
Markus Eisele
 

Fast Wavelet Tree Construction in Practice

  • 1. Fast Wavelet Tree Construction in Practice 2018/10/10 Yusaku Kaneta Rakuten Institute of Technology Rakuten, Inc.
  • 2. 2 Background § Wavelet tree [Grossi+,SODA’03] / Wavele matrix [Claude+,SPIRE’12] • Fundamentaltoolsinmodernstringprocessing. • Fastaccess,rank,andselectonanarrayofnintegersin[0,σ). 10 1 1 0 1 1 000 10 10 01 11 0 01 1 0100 11 00 0 1 1 0 100 01 1 00 0 11 1 10 00 101 10 1 10 1 01 00 1 Awavelettreeson[6,8,9,4,14,11,1,0,5,7,12,13,15,2,3,10]
  • 3. 3 Fast WT construction has attracted much attension!! Papers Sequential? Parallel? Impl.? [Fuentes-Sepúlveda+,SEA’14] P Yes [Shun+,DCC’15] P Yes [Labeit+,DCC’16] P Yes [Fischer+,ALENEX’18] S+P Yes [Munro+,SPIRE’14][Babenko+,SODA’15](Bestupperbound) S No [Shun+,DCC’17](Bestupperbound) P No § Gap between theory and practice: • Nopracticalimplementationof[Munro+,SPIRE’14][Babenko+, SODA’15]:Thecurrentfastestwavelettreeconstructionalgorithm.
  • 4. 4 First* practical implementation of wavelet tree construction based on [Munro+, SPIRE’14][Babenko+, SODA’15] Main result § Our idea: Replace precomputed tables w/ • SpecialCPUinstruction:PEXTin(BMI2)orPSHUFB(inSSSE3). • Broadwordcomputation(omitted). § Experiments on real datasets: • Wavelettree:ourswerecompetitivetoSOTA[Fischer+,ALENEX'18] • Waveletmatrix:ourswere1.1–1.9xfasterthanSOTA. *SzymonGrabowskikindlypointedthatTuukkaNorrialsotriedsimilarapproaches: github.com/tsnorri/wt-construct-gn
  • 6. 6 Basic construction of wavelet trees § Recursivelysplita(sub)arrayofelements accordingtotheiri-thtargetbits. • Processoneelementatatime. • Appenditstargetbittoabitvector. • Appendittotheleft(resp.,right)subarray ifitstargetbitis0(resp.,1). Assumewordlengthw = 32,inputintegerwidtht =4,andthusw/t = 8 forexplanation. 6 0110 8 1000 9 1001 4 0100 14 1110 11 1011 1 0001 0 0000 5 0101 7 0111 12 1100 15 1111 13 1101 2 0010 3 0011 10 1010
  • 7. 7 Basic construction of wavelet trees § Recursivelysplita(sub)arrayofelements accordingtotheiri-thtargetbits. • Processoneelementatatime. • Appenditstargetbittoabitvector. • Appendittotheleft(resp.,right)subarray ifitstargetbitis0(resp.,1). Assumewordlengthw = 32,inputintegerwidtht =4,andthusw/t = 8 forexplanation. 6 0110 8 1000 9 1001 4 0100 14 1110 11 1011 1 0001 0 0000 5 0101 7 0111 12 1100 15 1111 13 1101 2 0010 3 0011 10 1010
  • 8. 8 6 0110 8 1000 9 1001 4 0100 14 1110 11 1011 1 0001 0 0000 5 0101 7 0111 12 1100 15 1111 13 1101 2 0010 3 0011 10 1010 6 0110 8 1000 9 1001 4 0100 14 1110 11 1011 1 0001 0 0000 Fast construction of wavelet trees § Fastwavelettreeconstruction [Munro+, SPIRE’14][Babenko+,SODA’15] • Processmultipleelementsatatime. • w/toft-bitelementscanbereadtogether. § Primitiveoperations: X Bbitpack ( ,i)= X X0listsplit( ,i)=( , )X1 X1 A subarray consisting of X’s elements whose 0th bit 1 A subarray consisting of X’s elements whose 0th bit 0 X0 First w/t elements in a word of w bits (e.g., w = 32 and t = 4) X Packed 0th bits of elements contained in X. B Assumewordlengthw = 32,inputintegerwidtht =4,andthusw/t = 8 forexplanation. Assumption: • ThestandardwordRAM • w:wordlength(inbits) • t:inputintegerwidth(inbits) • t≤w justforexplanation. (Thisconditioncanbeeliminated.)
  • 9. 9 Main idea: Two special CPU instructions ParallelbitsEXTract PEXT(X,Y)=Z PacksbitsinXaccordingtoY suchthatforalli, bit(Z,i)=bit(X,j)holds. • bit(a,i):i-thbitofa. • select1(a,i):indexofy’si-th1. • j=select1(Y,i). ParallelSHUFfleBytes PSHUFB(X,Y)=Z Permutest-bitblocksinXaccordingtoY suchthatforalli, block(Z,i)=block(X,j)holds. • block(a,i):i-tht-bitblockofa. • j=block(Y,i). • Inpractice,w=64andt=8arerequired.
  • 10. 10 PEXT-based technique for bitpack § Preprocessing: • L: Packedarraywithblock(L,i) = 1foreveryiin[0,w/t) (i.e.,eacht-bitblockhas1onlyatitslowestbit). Assumewordsizew = 32,elementsizet =4,andthusw/t = 8 forexplanation. 01100100000100000101011100100011 00011001000001000001010111001000 >>t-i-1 00011001000001000001010111001000 00010001000100010001000100010001L Z Y Y X 11001100000000000000000000000000 1. Shift X by t-i-1=2 2. Perform PEXT(Y,L)=Z PEXT(Y,L) bitpack(X= ,i=1):01100100000100000101011100100011 6 4 1 0 5 7 2 3
  • 11. 11 PEXT-based technique for listsplit § Preprocessing: • H: Packedarraywithblock(H,i) = 2t-1foreveryiin[0,w/t) (i.e.,eacht-bitblockhas1onlyatitshighestbit). 2. Perform PEXT(X,M1)=Z1 w/ M1=~M0 6 4 01 2 3 0 0 046 05 7 f f 0ff 00 0 75 M1 Z1 X 1. Perform PEXT(X,M0)=Z0 w/ M0=(H-((X>>(t-i-1))&L)^H)^H 0 0 f00 ff f 6 4 701 5 2 3 0 0 001 02 3 M0 Z0 X listsplit(X= ,i=1): PEXT(X, M0) PEXT(Z, M1) 6 4 701 5 2 3 01100100000100000101011100100011 Assumewordsizew = 32,elementsizet =4,andthusw/t = 8 forexplanation. 01100100000100000101011100100011 01100100000100000101011100100011
  • 12. 12 PSHUFB-based technique for listsplit § Preprocessing: Let m=2w/t be # of blocks in a word. • T[a]forallain[0,m):Packedarraycontaininginascendingorder (1)allindexesof0’sfollowedby(2)allindexesof1’sin a. 2. Extract each part from Y. |a|1: Number of ones in a. <<t|a|1 6 4 701 52 3 0 0 001 02 3Z0 Y 6 4 701 52 3 0 0 046 05 7Z1 Y 00010000001000110110010001010111 00010000001000110110010001010111 1. Perform PSHUFB(X,T[a]) w/ a=bitpack(X,i)=11001100 0 1 532 46 7 01100100000100000101011100100011 6 4 701 5 2 3 6 4 701 52 3 T[a] Y X Assumewordsizew = 32,elementsizet =4,andthusw/t = 8 forexplanation. >>t|a|1 <<w-t|a|1PSHUFB(X,T[a]) 6 4 701 5 2 3 01100100000100000101011100100011 listsplit(X= ,i=1):
  • 14. 14 Experimental setup and data § Setup: • Corei7-4790(3.6GHz)w/16GBmainmemoryrunningUbuntu18.04. § Data: • 6and10datasetsfromPizzaandChilliandLightweightCorpus,resp. § Methods: • PSHUFB:ourmethodbasedonPSHUFBinstructioninSSSE3. • PEXT:ourmethodbasedonPEXTinstructioninBMI2. • NAÏVE,PS,PC:previousones(availableatgithub.com/kurpicz/pwd) bitpackwas implemented by PEXT in allour methods.
  • 15. 15 Result: wavelet tree § PSHUFB vs.NAÏVE: • 1.9x(average) § PEXT vs.NAÏVE: • 1.9x(average) § Oursv.s.PC/PS: • Competitive 1st winner 2nd winner Medianof5elapsedtimes(insec)forconstructing awavelettreewithoutrankandselectindexes. n σ NAÏVE PC PS PSHUFB PEXT dblp.xml 2.96·108 97 5.57 2.99 3.03 3.09 3.04 dna 4.04·108 16 4.43 2.42 2.72 2.07 2.05 english 2.21·109 238 53.0 27.2 28.8 23.7 23.5 pitches 5.58·107 132 1.25 0.685 0.812 0.576 0.570 proteins 1.18·109 27 16.1 8.29 8.67 9.27 9.12 sources 2.11·108 229 4.94 2.54 2.61 2.37 2.55 chr22.dna 3.46·107 5 0.233 0.143 0.188 0.157 0.156 etext99 1.05·108 145 2.32 1.31 1.62 1.16 1.14 gcc-3.0.tar 8.66·107 149 1.91 1.32 1.12 0.949 0.935 howto 3.94·107 196 0.832 0.478 0.496 0.438 0.432 jdk13c 6.97·107 113 1.30 0.708 0.789 0.829 0.755 linux- 2.4.5.tar 1.16·108 255 2.76 1.41 1.45 1.30 1.51 rctail96 1.15·108 93 2.25 1.18 1.20 1.19 1.17 rfc 1.16·108 120 2.28 1.25 1.27 1.20 1.19 sprot34.dat 1.10·108 66 2.12 1.14 1.36 1.13 1.13 w3c2 1.04·108 255 2.30 1.30 1.28 1.30 1.18 OurPSHUFB andPEXT outperformedNAÏVE whilebeingcompatiblewithPC.
  • 16. 16 Result: wavelet matrix § PSHUFB vs.NAÏVE: • 3.0–4.6x § PEXT vs.NAÏVE: • 2.5–4.5x § Oursv.s.PC: • 1.1–1.9x 1st winner 2nd winner n σ NAÏVE PC PS PSHUFB PEXT dblp.xml 2.96·108 97 6.20 3.00 3.01 2.02 2.05 dna 4.04·108 16 5.88 2.43 2.70 1.29 1.30 english 2.21·109 238 57.0 27.2 28.5 15.8 15.9 pitches 5.58·107 132 1.37 0.684 0.709 0.429 0.547 proteins 1.18·109 27 22.4 8.29 8.41 6.36 6.43 sources 2.11·108 229 5.81 2.54 2.95 1.57 1.59 chr22.dna 3.46·107 5 0.385 0.143 0.164 0.130 0.092 etext99 1.05·108 145 3.01 1.27 1.34 0.803 0.771 gcc-3.0.tar 8.66·107 149 2.18 1.05 1.29 0.633 0.639 howto 3.94·107 196 1.011 0.478 0.494 0.295 0.298 jdk13c 6.97·107 113 1.57 0.708 0.705 0.500 0.506 linux- 2.4.5.tar 1.16·108 255 3.14 1.41 1.64 0.872 1.11 rctail96 1.15·108 93 2.37 1.22 1.19 0.779 0.792 rfc 1.16·108 120 2.65 1.30 1.28 0.791 0.811 sprot34.dat 1.10·108 66 2.81 1.14 1.38 0.905 0.855 w3c2 1.04·108 255 2.63 1.54 1.27 0.80 0.81 Medianof5elapsedtimes(insec)forconstructing awaveletmatrixwithoutrankandselectindexes. Our PSHUFB and PEXT outperformed all others: NAÏVE, PS, and PC.
  • 17. 17 Conclusion § Practical wavelet tree construction using PEXT/PSHUFB. • Based on [Munro+, SPIRE’14] and [Babenko+, SODA’15] § Experiments on real datasets: • Wavelet tree: Faster than NAÏVE and competitive w/ SOTA: prefix sorting (PS) and prefix counting (PC). • Wavelet matrix: Faster than NAÏVE, PS, and PC. § Future work • Exploit moreparallelismin CPUcoresand/orSIMD registers.
  翻译: