Course 2: Programming Issues,
Section 2
Pascal Meunier, Ph.D., M.Sc., CISSP
May 2004; updated July 30, 2004
Developed thanks to the support of Symantec Corporation,
NSF SFS Capacity Building Program (Award Number 0113725)
and the Purdue e-Enterprise Center
Copyright (2004) Purdue Research Foundation. All rights reserved.
Course 2 Learning Plan


   Buffer Overflows
   Format String Vulnerabilities
   Code Injection and Input Validation
   Cross-site Scripting Vulnerabilities
   Links and Race Conditions
   Temporary Files and Randomness
   Canonicalization and Directory Traversal
Learning objectives


   Learn that format strings are interpreted, therefore
    are similar to code
   Understand the definition of a format string
    vulnerability
   Know how they happen
   Know how to format strings safely with regular "C"
    functions
   Learn other defenses against the exploitation of
    format string vulnerabilities
Format String Issues: Outline


   Introduction to format strings
   Fundamental "C" problem
   Examples
   Definition
   Importance
   Survey of unsafe functions
   Case study: analysis of cfingerd 1.4.3 vulnerabilities
   Preventing format string vulnerabilities without
    programming
   Lab: Find and fix format string vulnerabilities
   Tools to find format string issues
What is a Format String?


   In “C”, you can print using a format string:
   int printf(const char *format, ...);
   printf("Mary has %d cats", cats);
     – %d specifies a decimal number (from an int)
     – %s would specify a string argument,
     – %X would specify an unsigned uppercase hexadecimal
       (from an unsigned int)
     – %f expects a double and converts it into decimal notation,
       rounding as specified by a precision argument
     – etc...
Fundamental "C" Problem


   A variadic "C" function has no way to count the arguments
    it was passed, so missing arguments are not detected
   Format string is interpreted: it mixes code and data
   What happens if the following code is run?
   int main () {
        printf("Mary has %d cats");
    }
Result


   % ./a.out
    Mary has -1073742416 cats
   Program reads missing arguments off the stack!
      – And gets garbage (or interesting stuff if you want to probe
        the stack)
Probing the Stack


   Read values off stack
   Confidentiality violations
   printf("%08X")
    X is unsigned hexadecimal (uppercase digits; x gives lowercase)
    0: with '0' padding
    8 characters wide: 'AA03BF54'
    4 bytes = pointer on stack, canary, etc...
User-specified Format String


   What happens if the following code is run,
    assuming there always is an argument input by a
    user?
   int main(int argc, char *argv[])
    {
        printf(argv[1]);
        exit(0);
    }
   Try it and input "%s%s%s%s%s%s%s%s%s"
    How many "%s" arguments do you need to crash
    it?
Result


   % ./a.out "%s%s%s%s%s%s%s"
    Bus error
   Program was terminated by OS
     – Segmentation fault, bus error, etc... because the program
       attempted to read where it wasn't supposed to
   User input is interpreted as a format string (e.g., %s,
    %d, etc...)
   Anything can happen, depending on input!
   How would you correct the program?
Corrected Program


   int
    main(int argc, char *argv[])
    {
        printf("%s", argv[1]);
        exit(0);
    }
   % ./a.out "%s%s%s%s%s%s%s"
    %s%s%s%s%s%s%s
Format String Vulnerabilities


   Discovered relatively recently (around 2000)
   Limitation of “C” family languages
   Versatile
     – Can affect various memory locations
     – Can be used to create buffer overflows
     – Can be used to read the stack
   Not straightforward to exploit, but examples of root
    compromise scripts are available on the web
     – "Modify and hack from example"
Definition of a Format String Vulnerability


   A call to a function with a format string argument,
    where the format string is either:
     – Possibly under the control of an attacker
     – Not followed by the appropriate number of arguments
   As it is difficult to establish whether a data string
    could possibly be affected by an attacker, it is
    considered very bad practice to place a string to
    print as the format string argument.
     – Sometimes the bad practice is confused with the actual
       presence of a format string vulnerability
How Important Are Format String Vulnerabilities?

   Search NVD (icat) for “format string”:
     –   115 records in 2002
     –   153 total in 2003
     –   173 total in April 2004
     –   363 in February 2006
   Various applications
     –   Databases (Oracle)
     –   Unix services (syslog, ftp,...)
     –   Linux “super” (for managing setuid functions)
     –   cfingerd CVE-2001-0609
   Arbitrary code execution is a frequent consequence
Functions Using Format Strings


     printf - prints to "stdout" stream
     fprintf - prints to stream
     warn - standard error output
     err - standard error output
     setproctitle - sets the invoking process's title
     sprintf(char *str, const char *format, ...);
      – sprintf prints to a buffer
      – What’s the problem with that?
Sprintf Double Whammy


   format string AND buffer overflow issues!
   Buffer and format string are usually on the stack
   Buffer overflow rewrites the stack using values in
    the format string
Better Functions Than sprintf


   Note that these don't prevent format string
    vulnerabilities:
     – snprintf(char *str, size_t size, const char *format, ...);
          sprintf with length check for "size"
     – asprintf(char **ret, const char *format, ...);
          sets *ret to be a pointer to a buffer sufficiently large to hold
           the formatted string (note the potential memory leak).
Custom Functions Using Format Strings


   It is possible to define custom functions taking
    arguments similar to printf.
   wu-ftpd 2.6.1 proto.h
     – void reply(int, char *fmt,...);
     – void lreply(int, char *fmt,...);
     – etc...
   Can produce the same kinds of vulnerabilities if an
    attacker can control the format string
Write Anything Anywhere


   "%n" format command
   Writes a number to the location specified by
    argument on the stack
     – Argument treated as int pointer
         Often either the buffer being written to or the raw input is
          somewhere on the stack
            –   Attacker controls the pointer value!
     – Writes the number of characters written so far
         Keeps counting even if buffer size limit was reached!
         “Count these characters %n”
   All the gory details you don't really need to know:
     – Newsham T (2000) "Format String Attacks"
Case Study: Cfingerd 1.4.3


   Finger replacement
    – Runs as root
    – Pscan output: (CVE-2001-0609)
        defines.h:22 SECURITY: printf call should have "%s" as
         argument 0
        main.c:245 SECURITY: syslog call should have "%s" as
         argument 1
        main.c:258 SECURITY: syslog call should have "%s" as
         argument 1
        standard.c:765 SECURITY: printf call should have "%s" as
         argument 0
        etc... (10 instances total)
    – Discovery: Megyer Laszlo, a.k.a. "Lez"
Cfingerd Analysis


   Most of these issues are not exploitable, but one is,
    indirectly at that...
   Algorithm (simplified):
     – Receive an incoming connection
         get the fingered username
     – Perform an ident check (RFC 1413) to learn and log the
       identity of the remote user
     – Copy the remote username into a buffer
     – Copy that again into "username@remote_address"
         remote_address would identify attack source
     – Answer the finger request
     – Log it
Cfingerd Vulnerabilities


   A format string vulnerability giving root access:
     – Remote data (ident_user) is used to construct the
       format string:
     – snprintf(syslog_str, sizeof(syslog_str),
           "%s fingered from %s",
           username, ident_user
       );
       syslog(LOG_NOTICE, (char *) syslog_str);
   An off-by-one string manipulation (buffer overflow)
    vulnerability that
     – prevents remote_address from being logged (useful if
       attack is unsuccessful, or just to be anonymous)
     – Allows ident_user to be larger (and contain shell code)
Cfingerd Buffer Overflow Vulnerability


   memset(uname, 0, sizeof(uname));
    for (xp=uname;
        *cp!='\0' && *cp!='\r' && *cp!='\n'
        && strlen(uname) < sizeof(uname);
        cp++
    )
       *(xp++) = *cp;
   Off-by-one string handling error
     – uname can end up not NUL-terminated!
     – because strlen doesn't count the NUL, the test lets the
       loop fill all sizeof(uname) bytes
   Copying stops only once strlen reads past the end of
    the buffer
Direct Effect of Off-by-one Error


   char buf[BUFLEN], uname[64];
   "uname" and "buf" are "joined" as one string!
   So, even if only 64 characters from the input are
    copied into "uname", string manipulation functions
    will work with "uname+buf" as a single entity
   "buf" was used to read the response from the
    ident server so it is the raw input
Consequences of Off-by-one Error


  1) Remote address is not logged due to size
     restriction:
        snprintf(bleah, BUFLEN, "%s@%s", uname,
         remote_addr);
        Can keep trying various technical adjustments
         (alignments, etc...) until the attack works, anonymously
  2) There's enough space for format strings,
     alignment characters and shell code in buf (~60
     bytes for shell code):
        Rooted (root compromise) when syslog call is made
            i.e., cracker gains root privileges on the computer
             (equivalent to LocalSystem account)
Preventing Format String Vulnerabilities


  1) Always specify a format string
        Most format string vulnerabilities are solved by specifying
         "%s" as format string and not using the data string as
         format string
  2) If possible, make the format string a constant
        Extract all the variable parts as other arguments to the call
        Difficult to do with some internationalization libraries
  3) If the above two practices are not possible, use
     defenses such as FormatGuard (see next slides)
     – Rare at design time
     – Perhaps a way to keep using a legacy application and
       keep costs down
     – Increase trust that a third-party application will be safe
Windows


  Demo code for format string exploit in Howard and
   LeBlanc (2nd Ed.)
    – Same mechanisms as in UNIX-type systems
    – Prevented the same way
Defenses Against Exploitation


   FormatGuard
    – Use compiler macro tricks to count arguments passed
        Special header file
    – Patch to glibc
        Printf wrapper that counts the arguments needed by format
         string and verifies against the count of arguments passed
    – Kill process if mismatch
        What’s the problem with that?
FormatGuard Limitations


   What do you do if there's a mismatch in the
    argument count?
     – Terminate it (kill)
          Not complete fix, but DoS preferable to root compromise
     – If process is an important process that gets killed, Denial-
       of-Service attacks are still possible
          Although if you only manage to kill a "child process"
           processing your own attack, there's no harm done
FormatGuard Limitations (Cont.)


   Doesn't work when program bypasses
    FormatGuard by using own printf version or library
     – wu-ftpd had its own printf
     – gftp used Glib library
     – Side note: See how custom versions of standard
       functions make retrofit solutions more difficult?
          Code duplication makes patching more difficult
           Secure programming is the most secure option
Code Scanners


   Pscan searches for format string functions called
    with the data string as format string
     – Can also look for custom functions
         Needs a helper file that can be generated automatically
            –   Pscan helper file generator at
                http://www.cerias.purdue.edu/homes/pmeunier/dir_pscan.html
     – Few false positives
   http://www.striker.ottawa.on.ca/~aland/pscan/
gcc Options


   -Wformat (man gcc)
    – "Check calls to "printf" and "scanf", etc., to make sure that
      the arguments supplied have types appropriate to the
      format string specified, and that the conversions specified
      in the format string make sense. "
    – Also checks for null format arguments for several functions
         -Wformat also implies -Wnonnull
   -Wformat-nonliteral (man gcc)
    – "If -Wformat is specified, also warn if the format string is
      not a string literal and so cannot be checked, unless the
      format function takes its format arguments as a "va_list"."
gcc Options (Cont.)


   -Wformat-security (man gcc)
    – "If -Wformat is specified, also warn about uses of format
      functions that represent possible security problems. At
      present, this warns about calls to "printf" and "scanf"
      functions where the format string is not a string literal and
      there are no format arguments, as in "printf (foo);". This
      may be a security hole if the format string came from
      untrusted input and contains %n. (This is currently a
      subset of what -Wformat-nonliteral warns about, but in
      future warnings may be added to -Wformat-security that
      are not included in -Wformat-nonliteral.)"
   -Wformat=2
    – Equivalent to -Wformat -Wformat-nonliteral
      -Wformat-security.
Making gcc Look for Custom Functions


   Function attributes
     – Keyword "__attribute__" followed by specification
     – For format strings, use "__attribute__ ((format))"
     – Example:
         extern int
         my_printf (void *my_object,
                     const char *my_format, ...)
            __attribute__ ((format (printf, 2, 3)));
   gcc can help you find functions that might benefit
    from a format attribute:
     – Switch: "-Wmissing-format-attribute"
     – Prints "warning: function might be possible
       candidate for `printf' format attribute"
       when appropriate
Lab


  Jared wrote a server program with examples of
   format string vulnerabilities
      – Get it from the USB keyring, web site or FTP server
        (follow the instructor's directions)
  Usage
      – Compile with 'make vuln_server'
      – Run with './vuln_server 5555'
      – Open another shell window and type 'telnet localhost
        5555'
      – Find and fix all format string vulnerabilities
      – Try the gcc switches
First Lab Vulnerability


   What happens when you type in the string "Hello
    world!"? (It's printed back in reverse.)
   Type in a long string (more than 100 characters). It
    should crash. Where is the buffer overflow?
   Fix the buffer overflow, recompile, and
    demonstrate that it doesn't crash on long input
    lines any more.
   Bonus: Can you get a shell?
     – We didn't teach how to do that because our primary goal
       is to teach how to avoid the vulnerabilities, and as this lab
       demonstrates, you can do that without knowing how to
       get a shell
Second Lab Vulnerability


  1) Where is the format string problem?
  2) How do you crash the program? Hint: use %s
  3) How do you print the contents of memory to
     divulge the secret which is 0xdeadc0de? Hint: use
     %08x
  4) Bonus: Can you get a shell?
Third Lab Vulnerability


   Latent vulnerability hidden somewhere...
Questions or Comments?


About These Slides


     You are free to copy, distribute, display, and perform the work; and
      to make derivative works, under the following conditions.
      –   You must give the original author and other contributors credit
      –   The work will be used for personal or non-commercial educational uses
          only, and not for commercial activities and purposes
      –   For any reuse or distribution, you must make clear to others the terms of
          use for this work
      –   Derivative works must retain and be subject to the same conditions, and
          contain a note identifying the new contributor(s) and date of modification
      –   For other uses please contact the Purdue Office of Technology
          Commercialization.

   Developed thanks to the support of Symantec
    Corporation
Pascal Meunier
pmeunier@purdue.edu
Contributors:
Jared Robinson, Alan Krassowski, Craig Ozancin, Tim
Brown, Wes Higaki, Melissa Dark, Chris Clifton, Gustavo
Rodriguez-Rivera

2.Format Strings

  • 1. Course 2: Programming Issues, Section 2 Pascal Meunier, Ph.D., M.Sc., CISSP May 2004; updated July 30, 2004 Developed thanks to the support of Symantec Corporation, NSF SFS Capacity Building Program (Award Number 0113725) and the Purdue e-Enterprise Center Copyright (2004) Purdue Research Foundation. All rights reserved.
  • 2. Course 2 Learning Plan  Buffer Overflows  Format String Vulnerabilities  Code Injection and Input Validation  Cross-site Scripting Vulnerabilities  Links and Race Conditions  Temporary Files and Randomness  Canonicalization and Directory Traversal
  • 3. Learning objectives  Learn that format strings are interpreted, therefore are similar to code  Understand the definition of a format string vulnerability  Know how they happen  Know how to format strings safely with regular "C" functions  Learn other defenses against the exploitation of format string vulnerabilities
  • 4. Format String Issues: Outline  Introduction to format strings  Fundamental "C" problem  Examples  Definition  Importance  Survey of unsafe functions  Case study: analysis of cfingerd 1.4.3 vulnerabilities  Preventing format string vulnerabilities without programming  Lab: Find and fix format string vulnerabilities  Tools to find string format issues
  • 5. What is a Format String?  In “C”, you can print using a format string:  printf(const char *format, ...);  printf(“Mary has %d cats”, cats); – %d specifies a decimal number (from an int) – %s would specify a string argument, – %X would specify an unsigned uppercase hexadecimal (from an int) – %f expects a double and converts it into decimal notation, rounding as specified by a precision argument – etc...
  • 6. Fundamental "C" Problem  No way to count arguments passed to a "C" function, so missing arguments are not detected  Format string is interpreted: it mixes code and data  What happens if the following code is run?  int main () { printf("Mary has %d cats"); }
  • 7. Result  % ./a.out Mary has -1073742416 cats  Program reads missing arguments off the stack! – And gets garbage (or interesting stuff if you want to probe the stack)
  • 8. Probing the Stack  Read values off stack  Confidentiality violations  printf(“%08X”) x (X) is unsigned hexadecimal 0: with ‘0’ padding 8 characters wide: ‘0XAA03BF54’ 4 bytes = pointer on stack, canary, etc...
  • 9. User-specified Format String  What happens if the following code is run, assuming there always is an argument input by a user?  int main(int argc, char *argv[]) { printf(argv[1]); exit(0); }  Try it and input "%s%s%s%s%s%s%s%s%s" How many "%s" arguments do you need to crash it?
  • 10. Result  % ./a.out "%s%s%s%s%s%s%s" Bus error  Program was terminated by OS – Segmentation fault, bus error, etc... because the program attempted to read where it wasn't supposed to  User input is interpreted as string format (e.g., %s, %d, etc...)  Anything can happen, depending on input!  How would you correct the program?
  • 11. Corrected Program  int main(int argc, char *argv[]) { printf(“%s”, argv[1]); exit(0); }  % ./a.out "%s%s%s%s%s%s%s" %s%s%s%s%s%s%s
  • 12. Format String Vulnerabilities  Discovered relatively recently ~2000  Limitation of “C” family languages  Versatile – Can affect various memory locations – Can be used to create buffer overflows – Can be used to read the stack  Not straightforward to exploit, but examples of root compromise scripts are available on the web – "Modify and hack from example"
  • 13. Definition of a Format String Vulnerability  A call to a function with a format string argument, where the format string is either: – Possibly under the control of an attacker – Not followed by the appropriate number of arguments  As it is difficult to establish whether a data string could possibly be affected by an attacker, it is considered very bad practice to place a string to print as the format string argument. – Sometimes the bad practice is confused with the actual presence of a format string vulnerability
  • 14. How Important Are Format String Vulnerabilities?  Search NVD (icat) for “format string”: – 115 records in 2002 – 153 total in 2003 – 173 total in April 2004 – 363 in February 2006  Various applications – Databases (Oracle) – Unix services (syslog, ftp,...) – Linux “super” (for managing setuid functions) – cfingerd CVE-2001-0609  Arbitrary code execution is a frequent consequence
  • 15. Functions Using Format Strings  printf - prints to"stdout" stream  fprintf - prints to stream  warn - standard error output  err - standard error output  setproctitle - sets the invoking process's title  sprintf(char *str, const char *format, ...); – sprintf prints to a buffer – What’s the problem with that?
  • 16. Sprintf Double Whammy  format string AND buffer overflow issues!  Buffer and format string are usually on the stack  Buffer overflow rewrites the stack using values in the format string
  • 17. Better Functions Than sprintf  Note that these don't prevent format string vulnerabilities: – snprintf(char *str, size_t size, const char *format, ...);  sprintf with length check for "size" – asprintf(char **ret, const char *format, ...);  sets *ret to be a pointer to a buffer sufficiently large to hold the formatted string (note the potential memory leak).
  • 18. Custom Functions Using Format Strings  It is possible to define custom functions taking arguments similar to printf.  wu-ftpd 2.6.1 proto.h – void reply(int, char *fmt,...); – void lreply(int, char *fmt,...); – etc...  Can produce the same kinds of vulnerabilities if an attacker can control the format string
  • 19. Write Anything Anywhere  "%n" format command  Writes a number to the location specified by argument on the stack – Argument treated as int pointer  Often either the buffer being written to, or the raw input, are somewhere on the stack – Attacker controls the pointer value! – Writes the number of characters written so far  Keeps counting even if buffer size limit was reached!  “Count these characters %n”  All the gory details you don't really need to know: – Newsham T (2000) "Format String Attacks"
  • 20. Case Study: Cfingerd 1.4.3  Finger replacement – Runs as root – Pscan output: (CVE-2001-0609)  defines.h:22 SECURITY: printf call should have "%s" as argument 0  main.c:245 SECURITY: syslog call should have "%s" as argument 1  main.c:258 SECURITY: syslog call should have "%s" as argument 1  standard.c:765 SECURITY: printf call should have "%s" as argument 0  etc... (10 instances total) – Discovery: Megyer Laszlo, a.k.a. "Lez"
  • 21. Cfingerd Analysis  Most of these issues are not exploitable, but one is, indirectly at that...  Algorithm (simplified): – Receive an incoming connection  get the fingered username – Perform an ident check (RFC 1413) to learn and log the identity of the remote user – Copy the remote username into a buffer – Copy that again into "username@remote_address"  remote_address would identify attack source – Answer the finger request – Log it
  • 22. Cfingerd Vulnerabilities  A string format vulnerability giving root access: – Remote data (ident_user) is used to construct the format string: – snprintf(syslog_str, sizeof(syslog_str), "%s fingered from %s", username, ident_user ); syslog(LOG_NOTICE, (char *) syslog_str);  An off-by-one string manipulation (buffer overflow) vulnerability that – prevents remote_address from being logged (useful if attack is unsuccessful, or just to be anonymous) – Allows ident_user to be larger (and contain shell code)
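One way to close the format string hole described above is to pass the assembled log line to syslog as data. The sketch below is a reconstruction for illustration, not the cfingerd source: the function names `format_log_line` and `log_finger` and the 256-byte buffer are assumptions.

```c
#include <stdio.h>
#include <syslog.h>

/* Hypothetical helper: builds the log line with a constant format string. */
int format_log_line(char *dst, size_t dstsize,
                    const char *username, const char *ident_user)
{
    return snprintf(dst, dstsize, "%s fingered from %s",
                    username, ident_user);
}

/* Hypothetical logging step. */
void log_finger(const char *username, const char *ident_user)
{
    char syslog_str[256];
    format_log_line(syslog_str, sizeof(syslog_str), username, ident_user);

    /* Vulnerable form from the slide:
     *     syslog(LOG_NOTICE, (char *) syslog_str);
     * Any "%" directives inside ident_user would be interpreted. */

    /* Safer form: the line is an argument to a constant "%s" format. */
    syslog(LOG_NOTICE, "%s", syslog_str);
}
```

Calling `syslog(LOG_NOTICE, "%s fingered from %s", username, ident_user)` directly would work as well and avoids the intermediate buffer.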
  • 23. Cfingerd Buffer Overflow Vulnerability  memset(uname, 0, sizeof(uname)); for (xp=uname; *cp!='\0' && *cp!='\r' && *cp!='\n' && strlen(uname) < sizeof(uname); cp++ ) *(xp++) = *cp;  Off-by-one string handling error – uname is not NUL-terminated! – because strlen doesn't count the NUL, the loop can fill all sizeof(uname) bytes  It stops copying only when strlen reads past the end of the buffer
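For comparison, a bounded copy that always leaves room for the terminator could look like the sketch below. The function name `copy_first_line` and its signature are hypothetical; this is not the patch that was applied to cfingerd.

```c
#include <stddef.h>

/* Hypothetical helper: copies src up to the first CR, LF or NUL into dst.
 * Copies at most dstsize - 1 characters and always NUL-terminates.
 * Returns the number of characters copied. */
size_t copy_first_line(char *dst, size_t dstsize, const char *src)
{
    size_t i = 0;

    if (dstsize == 0)
        return 0;               /* no room even for the terminator */

    /* "i + 1 < dstsize" reserves the last byte for the NUL. */
    while (i + 1 < dstsize &&
           src[i] != '\0' && src[i] != '\r' && src[i] != '\n') {
        dst[i] = src[i];
        i++;
    }
    dst[i] = '\0';
    return i;
}
```

Tracking the index explicitly also avoids calling strlen on a buffer that may not be terminated yet.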
  • 24. Direct Effect of Off-by-one Error  char buf[BUFLEN], uname[64];  "uname" and "buf" are "joined" as one string!  So, even if only 64 characters from the input are copied into "uname", string manipulation functions will work with "uname+buf" as a single entity  "buf" was used to read the response from the ident server so it is the raw input
  • 25. Consequences of Off-by-one Error 1) Remote address is not logged due to size restriction:  snprintf(bleah, BUFLEN, "%s@%s", uname, remote_addr);  Can keep trying various technical adjustments (alignments, etc...) until the attack works, anonymously 2) There's enough space for format strings, alignment characters and shell code in buf (~60 bytes for shell code):  Rooted (root compromise) when syslog call is made  i.e., cracker gains root privileges on the computer (equivalent to LocalSystem account)
  • 26. Preventing Format String Vulnerabilities 1) Always specify a format string  Most format string vulnerabilities are solved by specifying "%s" as format string and not using the data string as format string 2) If possible, make the format string a constant  Extract all the variable parts as other arguments to the call  Difficult to do with some internationalization libraries 3) If the above two practices are not possible, use defenses such as FormatGuard (see next slides) – Rare at design time – Perhaps a way to keep using a legacy application and keep costs down – Increase trust that a third-party application will be safe
  • 27. Windows  Demo code for format string exploit in Howard and LeBlanc, "Writing Secure Code" (2nd Ed.) – Same mechanisms as in UNIX-type systems – Prevented the same way
  • 28. Defenses Against Exploitation  FormatGuard – Use compiler macro tricks to count arguments passed  Special header file – Patch to glibc  Printf wrapper that counts the arguments needed by format string and verifies against the count of arguments passed – Kill process if mismatch  What’s the problem with that?
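The format-side half of such a check can be sketched as a function that counts how many arguments a format string will consume. This is a simplified illustration and not FormatGuard's implementation: the name `count_format_args` is hypothetical, and positional arguments such as "%1$d" are not handled.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts the arguments a printf-style format string
 * consumes. Simplified: no support for positional ("%1$d") directives. */
int count_format_args(const char *fmt)
{
    int count = 0;

    for (const char *p = fmt; *p != '\0'; p++) {
        if (*p != '%')
            continue;
        p++;
        if (*p == '%')
            continue;           /* "%%" prints a percent sign, no argument */

        /* Skip flags, width, precision and length modifiers.
         * A '*' width or precision consumes an int argument. */
        while (*p != '\0' && strchr("-+ #0123456789.*hlLqjzt", *p) != NULL) {
            if (*p == '*')
                count++;
            p++;
        }
        if (*p == '\0')
            break;              /* incomplete directive at end of string */
        count++;                /* the conversion itself takes one argument */
    }
    return count;
}
```

A wrapper would compare this number with the count of arguments actually passed (which FormatGuard obtains through macro tricks) and refuse to format when more are needed than were supplied.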
  • 29. FormatGuard Limitations  What do you do if there's a mismatch in the argument count? – Terminate it (kill)  Not complete fix, but DoS preferable to root compromise – If process is an important process that gets killed, Denial-of-Service attacks are still possible  Although if you only manage to kill a "child process" processing your own attack, there's no harm done
  • 30. FormatGuard Limitations (Cont.)  Doesn't work when a program bypasses FormatGuard by using its own printf version or library – wu-ftpd had its own printf – gftp used the GLib library – Side note: See how custom versions of standard functions make retrofit solutions more difficult?  Code duplication makes patching more difficult  Secure programming is the most secure option
  • 31. Code Scanners  Pscan searches for format string functions called with the data string as format string – Can also look for custom functions  Needs a helper file that can be generated automatically – Pscan helper file generator at http://www.cerias.purdue.edu/homes/pmeunier/dir_pscan.html – Few false positives  http://www.striker.ottawa.on.ca/~aland/pscan/
  • 32. gcc Options  -Wformat (man gcc) – "Check calls to "printf" and "scanf", etc., to make sure that the arguments supplied have types appropriate to the format string specified, and that the conversions specified in the format string make sense." – Also checks for null format arguments for several functions  -Wformat also implies -Wnonnull  -Wformat-nonliteral (man gcc) – "If -Wformat is specified, also warn if the format string is not a string literal and so cannot be checked, unless the format function takes its format arguments as a "va_list"."
  • 33. gcc Options  -Wformat-security (man gcc) – "If -Wformat is specified, also warn about uses of format functions that represent possible security problems. At present, this warns about calls to "printf" and "scanf" functions where the format string is not a string literal and there are no format arguments, as in "printf (foo);". This may be a security hole if the format string came from untrusted input and contains %n. (This is currently a subset of what -Wformat-nonliteral warns about, but in future warnings may be added to -Wformat-security that are not included in -Wformat-nonliteral.)"  -Wformat=2 – Equivalent to -Wformat -Wformat-nonliteral -Wformat-security.
  • 34. Making gcc Look for Custom Functions  Function attributes – Keyword "__attribute__" followed by specification – For format strings, use "__attribute__ ((format))" – Example:  my_printf (void *my_object, const char *my_format, ...) __attribute__ ((format (printf, 2, 3)));  gcc can help you find functions that might benefit from a format attribute: – Switch: "-Wmissing-format-attribute" – Prints "warning: function might be possible candidate for `printf' format attribute" when appropriate
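A compilable sketch of the attribute in use, following the shape of the slide's my_printf example. The `int` return type, the `struct text_sink` type behind `my_object`, and the `PRINTF_LIKE` macro are assumptions added for illustration. The macro keeps the declaration portable to compilers that do not understand `__attribute__`.

```c
#include <stdarg.h>
#include <stdio.h>

#if defined(__GNUC__)
#define PRINTF_LIKE(fmt_idx, first_arg_idx) \
    __attribute__ ((format (printf, fmt_idx, first_arg_idx)))
#else
#define PRINTF_LIKE(fmt_idx, first_arg_idx)   /* no checking available */
#endif

/* Hypothetical object that my_printf writes into. */
struct text_sink {
    char data[128];
};

/* Parameter 2 is the format string; the variadic arguments start at 3. */
int my_printf(void *my_object, const char *my_format, ...) PRINTF_LIKE(2, 3);

int my_printf(void *my_object, const char *my_format, ...)
{
    struct text_sink *sink = my_object;
    va_list ap;

    va_start(ap, my_format);
    int n = vsnprintf(sink->data, sizeof sink->data, my_format, ap);
    va_end(ap);
    return n;
}
```

With the attribute in place, gcc applies its -Wformat checks to calls of `my_printf`, so a call like `my_printf(&sink, "%d cats", "three")` is reported as a type mismatch in the same way as it would be for printf.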
  • 35. Lab  Jared wrote a server program with examples of format string vulnerabilities – Get it from the USB keyring, web site or FTP server (follow the instructor's directions)  Usage – Compile with 'make vuln_server' – Run with './vuln_server 5555' – Open another shell window and type 'telnet localhost 5555' – Find and fix all format string vulnerabilities – Try the gcc switches
  • 36. First Lab Vulnerability  What happens when you type in the string "Hello world!"? (It's printed back in reverse.)  Type in a long string (more than 100 characters). It should crash. Where is the buffer overflow?  Fix the buffer overflow, recompile, and demonstrate that it doesn't crash on long input lines any more.  Bonus: Can you get a shell? – We didn't teach how to do that because our primary goal is to teach how to avoid the vulnerabilities, and as this lab demonstrates, you can do that without knowing how to get a shell
  • 37. Second Lab Vulnerability 1) Where is the format string problem? 2) How do you crash the program? Hint: use %s 3) How do you print the contents of memory to divulge the secret which is 0xdeadc0de? Hint: use %08x 4) Bonus: Can you get a shell?
  • 38. Third Lab Vulnerability  Latent vulnerability hidden somewhere...
  • 40. About These Slides  You are free to copy, distribute, display, and perform the work; and to make derivative works, under the following conditions. – You must give the original author and other contributors credit – The work will be used for personal or non-commercial educational uses only, and not for commercial activities and purposes – For any reuse or distribution, you must make clear to others the terms of use for this work – Derivative works must retain and be subject to the same conditions, and contain a note identifying the new contributor(s) and date of modification – For other uses please contact the Purdue Office of Technology Commercialization.  Developed thanks to the support of Symantec Corporation
  • 41. Pascal Meunier [email protected] Contributors: Jared Robinson, Alan Krassowski, Craig Ozancin, Tim Brown, Wes Higaki, Melissa Dark, Chris Clifton, Gustavo Rodriguez-Rivera