Meltria: A Dynamic Dataset Generation System for Anomaly Detection and Root Cause Analysis in Microservices / Meltria in IOTS2021
https://www.iot.ipsj.or.jp/symposium/iots2021-program/
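(9) Meltria: A Dynamic Dataset Generation System for Anomaly Detection and Root Cause Analysis in Microservices
◎Yuuki Tsubouchi (SAKURA internet, Kyoto University), Masaya Aoyama (SAKURA internet)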
Yuuki Tsubouchi (yuuk1)
November 26, 2021
Transcript
-
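Meltria: A Dynamic Dataset Generation System for Anomaly Detection and Root Cause Analysis in Microservices
Yuuki Tsubouchi (SAKURA internet, Kyoto University)
Masaya Aoyama (SAKURA internet)
IPSJ 14th Internet and Operation Technology Symposium (IOTS2021), November 26, 2021
-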
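2 Growing complexity of cloud applications and AIOps
Monolithic architecture: code changes are difficult for developers
Distributed by function into a microservice architecture:
‣ Higher frequency of changes
‣ More complex dependencies
‣ Larger volume of monitoring data
Cognitive load increases, and incident response that relies on operators' experience and intuition becomes difficult
Solve this with statistical analysis and machine learning (AIOps) [Soldani 21]
[Soldani 21]: Soldani, J. and Brogi, A., Anomaly Detection and Failure Root Cause Analysis in (Micro) Service-Based Cloud Applications: A Survey, arXiv preprint 2021.
-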
3 Challenges with operational datasets for evaluation
・Training and evaluating AI models requires datasets that contain anomalies
・Companies are reluctant to publish operational data for privacy and security reasons
Public datasets
・Contain only a limited set of anomaly patterns
・Models risk overfitting to a specific dataset [Loghub 20] [Exathlon 20]
・Avoiding overfitting requires training and evaluation on a rich variety of anomaly patterns
・Building a dataset in advance that covers every anomaly pattern is difficult [LogAD 21]
[Loghub 20]: He, Shilin, et al. "Loghub: A large collection of system log datasets towards automated log analytics." arXiv preprint 2020.
[Exathlon 20]: Jacob, Vincent, et al. "Exathlon: A Benchmark for Explainable Anomaly Detection over Time Series." arXiv preprint 2020.
[LogAD 21]: Zhao, Nengwen, et al. "An empirical investigation of practical log anomaly detection for online service systems." ACM ESEC/FSE. 2021.
-
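4 Proposal: a system that generates datasets dynamically
Dynamic
・The anomaly patterns and the data measurement conditions are variable
Input: anomaly reproduction conditions, data measurement conditions
Generation system
Output: dataset, used to discover and avoid overfitting
・Training and evaluating AI models requires context attached to the data (data labels), such as "whether an anomaly exists and where it is located"
・We want to reduce the effort of labeling every time a dataset is generated => we propose automating it
-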
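5 Contributions of this work
1. Novelty of generating datasets dynamically
Datasets are usually static and laborious to create
Experiments with the developed prototype revealed two kinds of cases that did not behave as expected
How far dynamic generation prevents overfitting still needs to be evaluated
2. Automated labeling of the presence and location of anomalies
Automatic labeling of datasets has hardly been discussed in the AIOps context
In the experiments, the labeling accuracy was 85%
A first step toward data-focused research that secures the general applicability of AIOps approaches
-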
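6 Setting the scope of the operational dataset
Surveyed 11 papers on anomaly detection and root cause analysis in microservices
Types of operational data
・Metrics: numerical time-series data
・Logs
・Traces: execution paths of requests
Target applications
・Sock Shop: small scale (8 services)
・Train Ticket: medium scale (41 services)
・Pymicro: simulator (16 services)
Types of injected faults
・Faults related to computing resources
・Rejection and delay of requests between services
・Faults specific to microservices
Paper counts, in the order they appear in the transcript: 7, 5, 3, 2, 2, 5, 7, 2, 1
Scope set to the most common case: metrics, faults related to computing resources, Sock Shop
-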
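7 Comparison with related technology
Chaos Engineering
The discipline of experimenting on a distributed system to gain confidence that it can withstand unexpected conditions [Basiri 16]
Tools for practicing Chaos Engineering: LitmusChaos, Chaos Mesh
‣ Fault injection: supports faults related to computing resources
‣ Overuse of CPU, memory, and disk, packet loss, and so on
‣ Scheduled execution of fault injection
Features that Chaos Engineering itself does not need, such as managing and labeling operational data, are not included
[Basiri 16]: Basiri, A., et al., Chaos Engineering, IEEE Software, 2016.
-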
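8 Design of the dynamic dataset generation system
Input: fault injection conditions, data measurement conditions
Generation workflow, then labeling
Output: dataset
First design criterion: fault injection scheduling that includes data management
‣ Extends Chaos Engineering
‣ No overlapping fault injections in the same time window
‣ The data corresponding to each injection can be linked to it
‣ Contains both normal and anomalous data
Second design criterion: automated labeling
For each series in the dataset, assign:
‣ Whether the fault injection had an effect
‣ Whether an unexpected anomaly is present
-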
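9 First design criterion: scheduling requirements for fault injection
Time slot: the basic unit of data
Components: Component A, Component B, Component C
Within Component A's time slot: Fault α injected span, recovery time, Fault β injected span
・Inject only the combinations of component × fault
・Wait between injections, because data from normal operation is also needed
・One fault per slot
-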
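The scheduling requirements above (one fault per slot, one slot per component × fault combination, no overlapping injections) can be sketched as a simple schedule builder. This is a minimal illustration, not Meltria's implementation: the `Slot` class, the `build_schedule` function, and the fault names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from itertools import product


@dataclass(frozen=True)
class Slot:
    """One time slot: exactly one fault injected into one component."""
    component: str
    fault: str
    start: datetime
    end: datetime


def build_schedule(components, faults, start, slot_length, repeats=1):
    """Give every (component, fault) pair its own slot, laid out back to back.

    Because each slot starts where the previous one ends, no two
    injections share a time window.
    """
    slots = []
    cursor = start
    for _ in range(repeats):
        for component, fault in product(components, faults):
            slots.append(Slot(component, fault, cursor, cursor + slot_length))
            cursor += slot_length
    return slots


# Hypothetical example: three components, two fault types, 30-minute slots.
schedule = build_schedule(
    components=["A", "B", "C"],
    faults=["alpha", "beta"],
    start=datetime(2021, 11, 26, 0, 0),
    slot_length=timedelta(minutes=30),
)
```

Each slot carries its component and fault, so the data collected in a slot can be linked back to the injection that produced it.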
10 First design criterion: system architecture
Components: Workflow Scheduler, Operational Data Storage, Load Generator, Target Application, Datasets Repository
1. Inject faults
2. Pick latest data to datasets (collect one slot's worth of data)
3. Wait until the application recovers (wait until the next injection time)
-
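The three workflow steps can be sketched as a loop. This is an illustration under assumptions, not the actual scheduler: `run_workflow` and its callback parameters are hypothetical names, and the real system talks to a chaos tool and a metrics store rather than to plain Python callables.

```python
import time
from typing import Callable, Iterable, List, Tuple


def run_workflow(
    plan: Iterable[Tuple[str, str]],
    inject_fault: Callable[[str, str], None],
    collect_slot_data: Callable[[], list],
    slot_seconds: float,
    sleep: Callable[[float], None] = time.sleep,
) -> List[dict]:
    """Run one injection per (component, fault) pair in the plan."""
    datasets = []
    for component, fault in plan:
        inject_fault(component, fault)   # step 1: inject the fault
        records = collect_slot_data()    # step 2: pick the latest slot of data
        datasets.append(
            {"component": component, "fault": fault, "records": records}
        )
        sleep(slot_seconds)              # step 3: wait for the next injection time
    return datasets


# Dry run with stand-in callbacks; nothing is injected and nothing sleeps.
events = []
result = run_workflow(
    plan=[("front-end", "cpu"), ("front-end", "memory-leak")],
    inject_fault=lambda c, f: events.append(("inject", c, f)),
    collect_slot_data=lambda: [0.1, 0.2],
    slot_seconds=1800,
    sleep=lambda s: events.append(("wait", s)),
)
```

Passing `sleep` as a parameter keeps the loop testable without waiting for real slots to elapse.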
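11 Second design criterion: labeling the dataset
(1) Selecting the series to label
Only the path of the anomaly is labeled
・s1: application-level metric
・s2: metric at the level of the microservice where the fault was injected
・s3: metric most related to the injected fault
(2) Classifying each series
Three classes, by the presence and location of an anomaly
・NOT_FOUND
・FOUND_INSIDE_ANOMALY (the expected state)
・FOUND_OUTSIDE_ANOMALY
-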
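12 Second design criterion: how the three labels are assigned
The 68-95-99.7 rule of the normal distribution (three-sigma rule)
68%, 95%, and 99.7% of values fall within 1, 2, and 3 standard deviations of the mean, respectively
(Figure quoted from Wikipedia, “68–95–99.7 rule”)
A series with values outside the range of 2 standard deviations is treated as anomalous
The location of the data points judged anomalous determines whether the label is FOUND_INSIDE_ANOMALY or FOUND_OUTSIDE_ANOMALY
-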
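The classification rule can be sketched as follows. This is a minimal sketch under stated assumptions, not the published implementation: the mean and standard deviation are taken over the whole series, the injected span is given as a half-open index range, and a series with any outlier outside that span is labeled FOUND_OUTSIDE_ANOMALY. The function name `classify_series` is hypothetical.

```python
from statistics import mean, pstdev

NOT_FOUND = "NOT_FOUND"
FOUND_INSIDE_ANOMALY = "FOUND_INSIDE_ANOMALY"
FOUND_OUTSIDE_ANOMALY = "FOUND_OUTSIDE_ANOMALY"


def classify_series(values, injected_span, n_sigma=2.0):
    """Label a series by whether and where it leaves the n-sigma band.

    injected_span is (start, end) as indices into values, end exclusive.
    Assumption: statistics are computed over the whole series.
    """
    start, end = injected_span
    mu = mean(values)
    sigma = pstdev(values)
    outliers = [i for i, v in enumerate(values) if abs(v - mu) > n_sigma * sigma]
    if not outliers:
        return NOT_FOUND
    if all(start <= i < end for i in outliers):
        return FOUND_INSIDE_ANOMALY
    # Assumption: any outlier outside the injected span counts as unexpected.
    return FOUND_OUTSIDE_ANOMALY


# Toy series: flat at 10, then a jump to 50 for the last three samples.
series = [10] * 17 + [50] * 3
label_expected = classify_series(series, injected_span=(15, 20))
label_unexpected = classify_series(series, injected_span=(0, 5))
label_flat = classify_series([10] * 20, injected_span=(15, 20))
```

Computing the band over the whole series means a long anomaly inflates the standard deviation and can hide itself, which is one reason a real implementation might estimate the band from normal-operation data instead.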
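13 Experimental setup
Data measurement conditions
・Data sampling interval: 15 seconds
・Slot duration: 30 minutes
・8 types of Sock Shop containers
・CPU overuse and memory leak
・5 injections each, 90 fault injections in total
Series selection
・s1: average response time of the front-end container
・s2: average response time of the service where the fault was injected
・s3: fault-related metric
 ・User-space CPU utilization
 ・Memory usage
-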
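As a quick check of what these settings imply, the arithmetic below derives the number of samples per series in one slot and the total run time. It assumes one injection per slot and slots running back to back, which follows the one-fault-per-slot requirement but is not stated as a total duration in the slides; the variable names are illustrative.

```python
from datetime import timedelta

sampling_interval = timedelta(seconds=15)
slot_length = timedelta(minutes=30)
total_injections = 90

# Samples collected per series in one slot: 30 min / 15 s.
samples_per_slot = slot_length // sampling_interval

# Wall-clock time if every injection occupies its own slot, back to back.
total_duration = slot_length * total_injections
total_hours = total_duration.total_seconds() / 3600

print(samples_per_slot, total_hours)  # 120 45.0
```

Under these assumptions a full run takes about 45 hours, which puts the future-work item on shortening dataset generation time in context.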
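14 Evaluation of the first design criterion
s1, s2, and s3 are all expected to be FOUND_INSIDE_ANOMALY
Cases where the classification by visual inspection did not match FOUND_INSIDE_ANOMALY fall into the following two types:
1. The fault injection failed and no fault occurred
・The cause of the failed injections is under investigation
・Failed injections need to be detected and labeled as failures
2. The effect of the previous fault injection was mixed in
・A mechanism that guarantees the application has recovered is needed
A quantitative evaluation has not yet been carried out
-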
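15 Evaluation of the second design criterion: labeling accuracy
257 series classified by visual inspection were used as the ground truth
The automated labeling achieved an accuracy of 85%
Examples of misclassification
・Fault injection failed (a), (b)
・Spike fluctuations during normal operation (c)
・Contamination from the effect of the previous injection (d)
-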
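The accuracy figure is the share of series whose automatic label matches the label assigned by visual inspection. A minimal sketch of that comparison follows; the function name `labeling_accuracy` is hypothetical and the four label pairs are toy values, not the experimental data.

```python
def labeling_accuracy(automatic, manual):
    """Fraction of series where the automatic label equals the manual one."""
    if len(automatic) != len(manual):
        raise ValueError("label lists must have the same length")
    if not manual:
        raise ValueError("at least one labeled series is required")
    matches = sum(a == m for a, m in zip(automatic, manual))
    return matches / len(manual)


# Toy example only: three of four automatic labels agree with the manual ones.
manual_labels = [
    "FOUND_INSIDE_ANOMALY",
    "FOUND_INSIDE_ANOMALY",
    "NOT_FOUND",
    "FOUND_OUTSIDE_ANOMALY",
]
automatic_labels = [
    "FOUND_INSIDE_ANOMALY",
    "NOT_FOUND",
    "NOT_FOUND",
    "FOUND_OUTSIDE_ANOMALY",
]
accuracy = labeling_accuracy(automatic_labels, manual_labels)
```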
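17 Summary
We proposed Meltria, a system that generates datasets dynamically, so that overfitting can be avoided through a rich variety of anomaly patterns
① Fault injection scheduling that "includes data management"
② Automated labeling of "the presence and location of anomalies" using the empirical rule of the normal distribution
The implementation is published on GitHub
https://github.com/ai4sre/meltria
Experiment with 90 fault injections
① Two kinds of cases in which the expected anomaly did not occur. These can be handled by adding operation guarantees to Meltria
② Labeling accuracy was 85%. One cause of misclassification is cases where the fluctuation is small
-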
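18 Future work
Expanding functionality
・Generating datasets for data types other than metrics
・Supporting larger-scale applications
・Supporting a wider variety of faults
・Shortening dataset generation time
Strengthening the academic contribution
・Evaluating how far dynamic generation prevents overfitting
・Evaluating how useful the proposed labeling is
Evolving into a system that automatically evaluates a variety of AIOps methods against arbitrary real applications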