Self-Healing with ITSI - Ashish Yadav Senior Software Developer | TIAA - Splunk Conf

Page created by Steve Medina
 
© 2020 SPLUNK INC.

Self-Healing
with ITSI

Ashish Yadav
Senior Software Developer | TIAA
Forward-Looking Statements

             During the course of this presentation, we may make forward-looking statements regarding
             future events or plans of the company. We caution you that such statements reflect our
             current expectations and estimates based on factors currently known to us and that actual
             events or results may differ materially. The forward-looking statements made in this
             presentation are being made as of the time and date of its live presentation. If reviewed after
             its live presentation, it may not contain current or accurate information. We do not assume
             any obligation to update any forward-looking statements made herein.

             In addition, any information about our roadmap outlines our general product direction and is
             subject to change at any time without notice. It is for informational purposes only, and shall
             not be incorporated into any contract or other commitment. Splunk undertakes no obligation
             either to develop the features or functionalities described or to include any such feature or
             functionality in a future release.

             Splunk, Splunk>, Data-to-Everything, D2E and Turn Data Into Doing are trademarks and registered trademarks of Splunk Inc. in the United States
             and other countries. All other brand names, product names or trademarks belong to their respective owners. © 2020 Splunk Inc. All rights reserved
Ashish
        Who’s this dude?
        • Senior Splunk Developer @ TIAA
        • 9+ years of experience with Splunk
        • Expertise in Splunk Enterprise Security & IT
          Service Intelligence
        • Internet of Things (IoT) Splunk Augmented
          Reality
        • Avid blogger & automation enthusiast
        • Author of a book on Splunk: Advanced Splunk


Where do I Work?
Part of IT Operation Stability & Innovation Team @ TIAA

What does my team do?
        • Managing & Maintaining Splunk Infrastructure: 10 TB / Day, Multi-Site, Multi-Clustering Environment
        • Event Routing & Event Aggregation
        • Infrastructure and Application Monitoring
        • Incident Management & Event Driven Action: 6000+ incidents per month on average

Problem & Challenges
Known issue

        • Alert Storm & False Positives
        • Siloed Teams
        • Manual Remediation
        • Maintenance Mode
        • Monitoring
        • Automation Loop
        • Cybersecurity Challenges
        • Unified Automation Orchestrator

Team Brain-storming
What did we do?

        • Incident Data Classification & Mining: Last One Year Data
        • Identification of Noise Makers: Top Talkers were Identified
        • Building Use-Cases: Windows Service, Disk Space full, Coherence Autoload, VM-F5 Server Certificate Issue
        • Building Framework: Creating Self-Healing Framework

Event-driven Approach

        • Monitor: Monitor the log sources via Splunk
        • Triggers: Define a trigger when something fails
        • Rules: Fetch the rules
        • Actions: Act based on the rules
        • Remediation: Problem is taken care of!
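
The Monitor -> Triggers -> Rules -> Actions -> Remediation chain can be illustrated with a small sketch. This is not the framework from the talk: the Event shape, the issue names, and the two action functions are hypothetical, and a plain dictionary stands in for wherever the rules are actually stored.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Event:
    """A failure event as a monitoring search might emit it (hypothetical shape)."""
    host: str
    issue: str  # e.g. "windows_service_down", "disk_space_full"


# Hypothetical remediation actions; real ones would call an orchestrator.
def restart_service(event: Event) -> str:
    return f"restarted service on {event.host}"


def clean_disk(event: Event) -> str:
    return f"cleaned disk on {event.host}"


# Rules: map an issue type to the action that remediates it (assumed mapping).
RULES: Dict[str, Callable[[Event], str]] = {
    "windows_service_down": restart_service,
    "disk_space_full": clean_disk,
}


def handle(event: Event) -> Optional[str]:
    """Trigger -> Rules -> Action. Returns the action result, or None if no rule matches."""
    action = RULES.get(event.issue)
    if action is None:
        return None  # no rule: leave the event for a human
    return action(event)
```

Calling handle(Event("web01", "disk_space_full")) looks up the rule for that issue and runs the matching action; an issue without a rule falls through untouched.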

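How Did We Do It?
Self-Healing framework

        • Rule Based Engine (KV Store) -> Data Enrichment -> Splunk ITSI Correlation Search
        • Maintenance Mode (KV Store) -> Data Enrichment -> Splunk ITSI Correlation Search
        • Splunk ITSI Correlation Search -> Aggregation Policy -> Self-Healing
        • Self Healing Monitoring (Correlation Search) -> Successful: Notify
        • Self Healing Monitoring (Correlation Search) -> Failed: Create Incident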
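
One way to read the decision flow in this diagram is sketched below. It is an illustration under stated assumptions, not the presenter's implementation: the two module-level collections are hypothetical stand-ins for the KV Store lookups, the outcome strings are invented labels, and treating "in maintenance" as "suppress" and "no rule" as "create incident" are assumptions the slide does not spell out.

```python
from typing import Callable, Dict, Optional, Set

# Hypothetical stand-ins for the two KV Store collections on the slide.
MAINTENANCE_HOSTS: Set[str] = {"db02"}
RULES: Dict[str, str] = {"disk_space_full": "clean_disk"}


def self_heal(host: str, issue: str, run_action: Callable[[str, str], bool]) -> str:
    """Return the outcome label for one aggregated event."""
    # Data enrichment: look up maintenance state and the matching rule.
    if host in MAINTENANCE_HOSTS:
        return "suppressed"       # assumption: no action while in maintenance
    action: Optional[str] = RULES.get(issue)
    if action is None:
        return "create_incident"  # assumption: no rule means a human takes over
    # Self-healing step, followed by monitoring of its result.
    succeeded = run_action(action, host)
    return "notify" if succeeded else "create_incident"
```

The run_action callable is injected so the orchestrator call can be swapped or stubbed; it receives the action name and the host and reports success as a boolean.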

What Changed?
How did we address the problems?

             Problem                          Solution

             Alert Storms & False Positives   ITSI Event Aggregation

             Manual Remediation & High MTTR   Automatic Remediation & Low MTTR

             Cybersecurity Challenges         Unified Automation Orchestrator

             Siloed Teams                     Not a Standard or a Solution, but a Framework

What Changed?
How did we address the challenges?

              Challenges                                       Solution

              What if Automation was Unsuccessful?             End to End Integration Framework

              What if my Framework Failed?                     End to End Monitoring of the Framework

              Automation should not go into an infinite loop   Orchestrators & ITSI Event Aggregation

              How to address Maintenance Mode                  Rule Based Engine (KV Store)
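
The slide attributes loop prevention to the orchestrators and ITSI event aggregation and gives no further detail. As a generic illustration of the idea only (not what the talk describes), the sketch below caps remediation attempts per host and issue within a sliding time window. The LoopGuard class, its parameters, and the limits used in the usage note are all hypothetical.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Tuple


class LoopGuard:
    """Allow at most max_attempts remediations per (host, issue) per time window."""

    def __init__(self, max_attempts: int, window_seconds: float) -> None:
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self._attempts: Dict[Tuple[str, str], Deque[float]] = defaultdict(deque)

    def allow(self, host: str, issue: str, now: float) -> bool:
        """Record and permit an attempt at time `now`, or refuse it if over the limit."""
        attempts = self._attempts[(host, issue)]
        # Drop attempts that have fallen out of the sliding window.
        while attempts and now - attempts[0] >= self.window_seconds:
            attempts.popleft()
        if len(attempts) >= self.max_attempts:
            return False  # limit reached: stop automating, escalate instead
        attempts.append(now)
        return True
```

With LoopGuard(max_attempts=2, window_seconds=60.0), a third attempt for the same host and issue inside one minute is refused, which is the point at which a framework like this would stop retrying and raise an incident.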

Results
We can do better!!!

        • 80% Noise Reduction
        • 800 Human Hours Saved Yearly
        • 90% Increased Efficiency
        • 99% Reliable and Scalable

The Journey Ahead . . . !!!

Detect     Act                Predict             Prevent
Monitor    Remediation        Pattern Analysis    Generate Warnings
Trigger    MS Teams           Outlier Detection   Impact Prediction
Rules      JIRA               Clustering

Please provide feedback via the SESSION SURVEY