MetricSifter: Feature Reduction of Multivariate Time Series Data for Efficient Fault Localization in Cloud Applications / FIT 2024
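FIT 2024 Top Conference Session
https://www.ipsj.or.jp/event/fit/fit2024/abstract/data/html/event/event_TCS7-3.html
[Japanese title] MetricSifter: Feature Reduction of Multivariate Time Series Data for Efficient Fault Localization in Cloud Applications
Yuuki Tsubouchi (Senior Researcher, SAKURA internet Research Center, SAKURA internet Inc.)
[Bibliographic information of the original publication] Tsubouchi, Y., Tsuruta, H.: MetricSifter: Feature Reduction of Multivariate Time Series Data for Efficient Fault Localization in Cloud Applications, IEEE Access, Vol.12, pp.37398-37417 (2024).
[Abstract] Research on machine-learning-based fault localization for large-scale cloud applications is active. This work proposes MetricSifter, a feature reduction framework for time series data that accurately identifies failure-related monitoring metrics. The method focuses on the temporal proximity of failure-induced change points in monitoring metrics, and makes existing fault localization methods both more accurate and more efficient.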
Yuuki Tsubouchi (yuuk1)
September 06, 2024
Transcript
-
2 Contents
1. Introduction
2. Failure-oriented Feature Reduction Framework
3. Evaluation
4. Conclusion
[Bibliographic information of the original publication] Tsubouchi, Y., Tsuruta, H.: MetricSifter: Feature Reduction of Multivariate Time Series Data for Efficient Fault Localization in Cloud Applications, IEEE Access, Vol.12, pp.37398-37417 (2024). Hereafter cited as [Tsubouchi+,ACCESS24].
-
3 Introduction
-
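4 The dilemma of fault localization in the cloud
Diagram labels: cloud, Internet, application, telemetry system, metrics, time series data, operator
Cloud applications have grown complex as distributed systems, so telemetry for diagnosing failures has become important.
Automated fault localization [9-24]: automates the localization work.
Dilemma:
(1) The number of metrics keeps growing.
(2) Irrelevant data mixed into the input lowers localization accuracy and speed [25].
-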
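5 Toward resolving the dilemma
Diagram labels: metrics, time series data, operator, automated fault localization [9-24], root cause ranking
Feature reduction [25,26]: reduce the number of unnecessary metrics (time series) in the multivariate time series data.
Approaches: unsupervised, or trained on normal-period data; lightweight statistical analysis, machine learning, etc.
-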
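6 Existing feature reduction and its challenges
Anomaly-based reduction [14,23,26]: focuses on whether each time series contains an anomaly.
  Challenge: cannot remove anomalies that occur outside the failure window (false negatives).
Redundancy-based reduction [9,12,16,25]: focuses on similarity and relatedness between time series.
  Challenge: erroneously removes failure-related metrics that resemble one another (false positives).
Cause: both rely on anomaly or redundancy that is local to individual metrics.
Goal: capture relatedness to a single global "failure".
-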
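7 Observation and assumption
Reproduced from [Tsubouchi+,ACCESS24] FIGURE 1. The failure occurs at 160 on the horizontal axis.
Observation: failure-induced change points appear close together in time.
Assumption: the time range in which change points are most concentrated is the failure period.
Figure labels: local event, global event
-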
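10 How does MetricSifter work?
Reproduced from [Tsubouchi+,ACCESS24] FIGURE 5.
STEP 1: For each time series, detect candidate change points that may originate from the failure.
STEP 2: Split the time axis into segments based on the density of change point times.
STEP 3: Select the segment with the highest density.
-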
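11 STEP 1: Change point detection on univariate time series
Design policy: choose existing change point detection methods [48] that suit the domain.
(1) Cost function (selects the type of change to detect): L2 (mean shift).
(2) Search method (the algorithm that searches for change points): Pelt, which finds the exact solution and can be sped up by conditional pruning.
(3) Penalty term (constrains the number of detected change points): determined heuristically from BIC, with an additional ad hoc weight ω.
-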
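A minimal sketch of STEP 1 under stated assumptions: it uses the L2 (mean shift) cost and an exact dynamic-programming search, but omits the pruning that makes Pelt fast, and the penalty form ω·σ²·log n is an assumed BIC-style choice rather than the formula from the paper. The function names are hypothetical.

```python
import math


def l2_cost_factory(x):
    """Return cost(s, e): sum of squared deviations from the mean of x[s:e]."""
    n = len(x)
    c1 = [0.0] * (n + 1)  # prefix sums of x
    c2 = [0.0] * (n + 1)  # prefix sums of x squared
    for i, v in enumerate(x):
        c1[i + 1] = c1[i] + v
        c2[i + 1] = c2[i] + v * v

    def cost(s, e):
        total = c1[e] - c1[s]
        return (c2[e] - c2[s]) - total * total / (e - s)

    return cost


def detect_change_points(x, omega=2.5):
    """Exact penalised segmentation (optimal partitioning, no pruning).

    Assumption: penalty = omega * variance * log(n), a BIC-style heuristic.
    """
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    penalty = omega * var * math.log(n)
    cost = l2_cost_factory(x)

    best = [0.0] * (n + 1)  # best[e]: minimal penalised cost of x[:e]
    best[0] = -penalty      # so the first segment carries no penalty
    last = [0] * (n + 1)    # last[e]: start index of the final segment of x[:e]
    for e in range(1, n + 1):
        best[e], last[e] = min(
            (best[s] + cost(s, e) + penalty, s) for s in range(e)
        )

    # Walk back through the stored segment starts to recover change points.
    cps = []
    e = n
    while last[e] > 0:
        e = last[e]
        cps.append(e)
    return sorted(cps)
```

For a series that jumps from 0 to 5 at index 50, `detect_change_points` returns `[50]`. A larger ω yields fewer change points, a smaller ω more.
-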
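12 STEP 2: Density estimation of change points and segmentation
Reproduced from [Tsubouchi+,ACCESS24] FIGURE 6.
(1) Density estimation: use kernel density estimation (KDE) to build a density over the discrete change point times.
(2) Segmentation: draw boundaries at the local minima of the density (Fig. 6 is split into 10 segments).
STEP 3: Select the largest segment.
-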
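STEP 2 and STEP 3 can be sketched as follows, assuming a Gaussian kernel with bandwidth `h` evaluated on an integer time grid, and assuming that "largest segment" means the segment containing the most change points. The function names and the input layout (a dict from metric name to change point times) are hypothetical.

```python
import math


def kde(points, grid, h):
    """Gaussian kernel density estimate of `points`, evaluated on `grid`."""
    norm = len(points) * h * math.sqrt(2 * math.pi)
    return [
        sum(math.exp(-0.5 * ((t - p) / h) ** 2) for p in points) / norm
        for t in grid
    ]


def select_failure_metrics(change_points, length, h=3.0):
    """Keep metrics that have a change point in the densest segment.

    change_points: dict mapping metric name -> list of change point times.
    length: number of time steps in the observation window.
    """
    all_points = [t for cps in change_points.values() for t in cps]
    if not all_points:
        return set()

    # STEP 2 (1): density of change point times over the time axis.
    density = kde(all_points, list(range(length)), h)

    # STEP 2 (2): segment boundaries at local minima of the density.
    minima = [
        i for i in range(1, length - 1)
        if density[i] < density[i - 1] and density[i] <= density[i + 1]
    ]
    bounds = [0] + minima + [length]
    segments = list(zip(bounds[:-1], bounds[1:]))

    # STEP 3: pick the segment holding the most change points (assumption).
    def count(segment):
        lo, hi = segment
        return sum(lo <= t < hi for t in all_points)

    lo, hi = max(segments, key=count)
    return {
        name for name, cps in change_points.items()
        if any(lo <= t < hi for t in cps)
    }
```

With change points clustered around time 160 for three metrics and isolated change points elsewhere, only the three clustered metrics are kept. A metric with no change point at all is always dropped.
-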
13 Evaluation
-
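14 Evaluation
Q1: How accurate is the feature reduction?
Q2: How much does it improve fault localization performance?
Q3: How sensitive is it to parameters? (Parameter Sensitivity)
Q4: How much does each STEP of the proposed method contribute to performance? (Ablation Study)
-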
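15 Datasets
Adapted from [Tsubouchi+,ACCESS24] TABLE 4.
Synthetic data: multivariate time series and DAGs generated by simulating failures with [58].
Datasets: D^sim_{50,100}, D^sim_{50,200}, D^sim_{200,500}, D^sim_{200,700}
Columns: number of nodes, number of edges (cell values in slide order: 50, 100, 200, 100, 500, 700)
Empirical data: collected by injecting CPU or memory overload faults into standard benchmark applications.
Dataset     Application        Services  Faults  Metrics
SS-small    Sock Shop (SS)     7         90      64
SS-medium   Sock Shop (SS)     7         90      184
SS-large    Sock Shop (SS)     7         90      1312
TT-small    Train Ticket (TT)  41        42      383
TT-medium   Train Ticket (TT)  41        42      1349
TT-large    Train Ticket (TT)  41        42      9458
-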
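16 Q1: How accurate is the feature reduction?
(c) Balanced accuracy, reproduced from [Tsubouchi+,ACCESS24] FIGURE 7. (c)
MetricSifter achieved the best mean accuracy, 0.981.
The redundancy-based reduction group scored low overall, because metrics within MA ∪ MB whose time series are similar or correlated get removed.
-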
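17 Q2: How much does it improve fault localization performance?
Figure excerpt (panels: PC+HT, random selection)
Overall: MetricSifter achieves accuracy close to the ideal method.
Mean top-5 accuracy over combinations with all fault localization methods:
Method        Accuracy  Note
Ideal         0.344     ideal
MetricSifter  0.299     best
NSigma        0.241     runner-up
None          0.175     without feature reduction
-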
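18 Q2: Empirical datasets
Partial excerpt reproduced from [Tsubouchi+,ACCESS24] FIGURE 11. (a)
SS-small, 64 metrics. Bars show accuracy; the line shows execution time.
・MetricSifter has the best top-5 accuracy, and its execution efficiency is also higher than that of anomaly-based reduction.
・Redundancy-based reduction (HDBS-SBD/HDBS-R) has the best execution time but the lowest accuracy.
-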
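19 Q2: Empirical data in detail (large scale, >100 metrics)
Partial excerpt reproduced from [Tsubouchi+,ACCESS24] FIGURE 11. (b)
Panels: SS-medium (184 metrics), SS-large (1312), TT-small (383), TT-medium (1349)
・Only RCD finished within a practical time (3600 seconds or less), because the other fault localization algorithms have no parallelism.
・With more than 1000 metrics, accuracy was very low regardless of whether feature reduction was applied.
-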
20 Conclusion
-
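21 Summary
・The first study to carry out a quantitative comparative evaluation of feature reduction.
・Proposed MetricSifter, a feature reduction method that captures a global failure from local change points.
・On synthetic data, it achieved the best accuracy, 0.981, and improved fault localization accuracy by 24%.
・On empirical data, it improved the accuracy of fault localization, its efficiency, or both.
Future work
・Define a problem of more aggressive reduction, aiming to reduce 1000 or more metrics to about 100 (the current reduction rate is about 40-60%).
・Evaluate with SOTA fault localization methods and other public datasets.
Code and datasets: https://github.com/ai4sre/metricsifter
-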
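23 Definition of feature reduction
Reproduced from [Tsubouchi+,ACCESS24] FIGURE 2.
Model of anomaly propagation at metric granularity after a fault occurs:
MA: metrics on which the effect appears directly
MB: metrics on which the effect appears indirectly
MC: unaffected metrics
Task: once a failure is detected, identify MA ∪ MB as quickly as possible.
-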
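26 Evaluation metrics
Feature reduction methods (general classification metrics):
・Specificity and Recall, which check two things: are metrics being removed by mistake, and is the amount of reduction appropriate?
・Balanced Accuracy (BA) = (Specificity + Recall)/2
Fault localization methods (standard metrics in this domain):
・AC@k: the rate at which the correct answer is included in the top-k
・AVG@5: the arithmetic mean of AC@j for 1 ≤ j ≤ 5
-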
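The metrics above can be written out as a short sketch. It assumes that feature reduction is scored by treating "kept" as the positive class and that each fault case has exactly one ground-truth root cause. The function names and input shapes are hypothetical.

```python
def balanced_accuracy(kept, relevant, all_metrics):
    """BA = (Specificity + Recall) / 2 for one feature reduction result.

    kept: metrics retained by the reduction method.
    relevant: ground-truth failure-related metrics.
    all_metrics: every metric in the dataset.
    """
    kept, relevant = set(kept), set(relevant)
    irrelevant = set(all_metrics) - relevant
    recall = len(kept & relevant) / len(relevant)           # relevant metrics retained
    specificity = len(irrelevant - kept) / len(irrelevant)  # irrelevant metrics removed
    return (specificity + recall) / 2


def ac_at_k(rankings, truths, k):
    """Fraction of fault cases whose true root cause is within the top-k."""
    hits = sum(truth in ranking[:k] for ranking, truth in zip(rankings, truths))
    return hits / len(rankings)


def avg_at_5(rankings, truths):
    """Arithmetic mean of AC@j for 1 <= j <= 5."""
    return sum(ac_at_k(rankings, truths, j) for j in range(1, 6)) / 5
```

For example, keeping {a, b, x} when {a, b} are relevant out of {a, b, x, y} gives Recall 1.0 and Specificity 0.5, so BA is 0.75.
-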
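27 Baselines
Feature reduction methods:
・Anomaly-based reduction: NSigma, BIRCH, K-S test, FluxInfer-AD
・Redundancy-based reduction: HDBSCAN + SBD, HDBSCAN + Pearson correlation
・Ideal method: Ideal (the ground truth; Balanced Accuracy is 100%)
Fault localization methods:
・Random selection (RS)
・Anomaly-based: ε-Diagnosis
・Anomaly-propagation-based (causal graph construction + scoring): PC+PageRank, PC+HT, LiNGAM+PageRank, LiNGAM+HT, RCD
-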
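28 Q3: How sensitive is it to parameters? (Parameter Sensitivity)
Reproduced from [Tsubouchi+,ACCESS24] FIGURE 9.
ω: weight of the penalty term in change point detection (STEP 1). Accuracy peaks near 2.5 and is sensitive to decreases in ω.
h: smoothing of the estimated density function (STEP 2). Its effect on accuracy is small.
-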
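29 Q4: How much does each part of the proposed method contribute to performance?
Reproduced from [Tsubouchi+,ACCESS24] FIGURE 10.
・With appropriate parameters, the accuracy difference is small, because change point detection is already highly accurate on clean synthetic data.
・When the STEP 1 (change point detection) parameter ω is low, accuracy drops. However, STEP 2/3 recover the accuracy.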