Oscar Forner

Introduction

Clang is a compiler front end for the C, C++, Objective-C and Objective-C++ programming languages. It uses LLVM as its back end. In this post I talk about some of the sanitizers available in Clang (some are available in GCC as well). They help you detect problems at run time (dynamic analysis).

As usual, I am working on an Arch Linux computer, so I can install Clang and its tools from the repository (the clang package). For other distributions you can find the information in the documentation.

As always, all the code used in this post is available in this repo.

The videos are made with asciinema, which means you can copy text from the video.

Clang sanitizers

The Clang sanitizers available are:

AddressSanitizer: detects memory errors. http://clang.llvm.org/docs/AddressSanitizer.html
ThreadSanitizer: detects data races. http://clang.llvm.org/docs/ThreadSanitizer.html
MemorySanitizer: detects uninitialized reads. http://clang.llvm.org/docs/MemorySanitizer.html
UndefinedBehaviorSanitizer: detects undefined behavior. http://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html
DataFlowSanitizer: a generalised dynamic data flow analysis. Unlike the other sanitizers, it is not designed to detect a specific class of bugs on its own; it provides a generic dynamic data flow analysis framework that clients can use to detect application-specific issues within their own code. http://clang.llvm.org/docs/DataFlowSanitizer.html
LeakSanitizer: detects memory leaks. It can be combined with AddressSanitizer. http://clang.llvm.org/docs/LeakSanitizer.html

In this post I show AddressSanitizer, ThreadSanitizer, MemorySanitizer and UndefinedBehaviorSanitizer. I do not talk about DataFlowSanitizer because it is a work in progress and is really specific to the application that needs it. I also skip LeakSanitizer because it is a subset of the AddressSanitizer checks that can be run independently.

AddressSanitizer

Usage of freed memory

The code below is trying to use a region of memory that has been freed already.

int main()
{
  int *array = new int[100];
  delete [] array;
  return array[1];
}

We can compile it with:

clang++-3.8 -fsanitize=address -g -o free free.cpp

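When you run the resulting executable, AddressSanitizer should abort the program with a heap-use-after-free report that shows where the invalid read happened, where the memory was freed and where it was originally allocated.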
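One way to fix this kind of bug, sketched under the assumption that the element is meant to be read while the array is still alive, is to copy the value out before calling delete []. The helper name read_then_free and the stored value 7 are hypothetical and only serve the illustration.

```cpp
// Hypothetical helper (not from the original post): copies one element out
// of a heap array before the array is released.
int read_then_free()
{
  int *array = new int[100];
  array[1] = 7;                // arbitrary value, chosen for illustration
  const int value = array[1];  // read while the allocation is still valid
  delete [] array;
  return value;                // the freed memory is never touched again
}
```

Calling read_then_free() from main and rebuilding with the same -fsanitize=address flags should no longer produce a report for this access.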

Buffer overflow

The code below is trying to access a memory region that is not part of the allocated one.

int main()
{
  int * array = new int[50];
  return array[100];
}

We can compile it with:

clang++-3.8 -fsanitize=address -g -o overflow overflow.cpp

When you run the resulting executable, AddressSanitizer should abort the program with a heap-buffer-overflow report for the out-of-bounds read.
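A possible way to avoid the overflow, assuming the array can be replaced by a std::vector, is to use the bounds-checked at() member, which throws std::out_of_range instead of reading past the end. The helper checked_read and its fallback parameter are hypothetical names introduced for this sketch.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: returns the element at `index`, or `fallback` when
// the index lies outside the vector.
int checked_read(const std::vector<int> &values, std::size_t index, int fallback)
{
  try {
    return values.at(index);  // at() throws std::out_of_range on a bad index
  } catch (const std::out_of_range &) {
    return fallback;          // the invalid read never happens
  }
}
```

With a vector of 50 elements, checked_read(values, 100, -1) returns the fallback instead of touching memory outside the allocation.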

Memory leak

The code below is allocating memory twice but it only frees the memory once.

int main()
{
  int * array = new int[5];
  array = new int[10];
  delete [] array;
  return 0;
}

We can compile it with:

clang++-3.8 -fsanitize=address -g -o leak leak.cpp

When you run the resulting executable, the program finishes and LeakSanitizer (enabled by default with AddressSanitizer on Linux) should then report the first, 5-element array as a direct leak.
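One way to remove the leak, assuming the code can be compiled in C++11 mode or later (for example with -std=c++11), is to let std::unique_ptr own the array so that reassigning it frees the previous allocation. The helper name allocate_twice is hypothetical.

```cpp
#include <memory>

// Hypothetical helper mirroring the example: allocate twice, keep only the
// second array. unique_ptr deletes the first array on reset() and the second
// one when `array` goes out of scope.
int allocate_twice()
{
  std::unique_ptr<int[]> array(new int[5]());
  array.reset(new int[10]());  // the 5-element array is deleted here
  array[9] = 1;                // last valid index of the new array
  return array[9];
}
```

No explicit delete [] is needed any more, which also rules out freeing the same array twice.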

Double free

The code below is allocating memory once but it is freeing it twice.

int main()
{
  int * array = new int[5];
  delete [] array;
  delete [] array;
  return 0;
}

We can compile it with:

clang++-3.8 -fsanitize=address -g -o double_free double_free.cpp

When you run the resulting executable, AddressSanitizer should abort at the second delete [] with an attempting double-free report.
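A common defensive pattern for this bug, shown here only as a sketch, is to reset the pointer after releasing it, because delete [] on a null pointer does nothing. The helper name release is hypothetical, and nullptr assumes C++11 or later.

```cpp
// Hypothetical helper: releases an array and clears the caller's pointer, so
// a second call finds a null pointer and does nothing.
void release(int *&array)
{
  delete [] array;  // delete [] on a null pointer is a no-op
  array = nullptr;  // C++11; older modes would use 0 or NULL
}
```

Calling release(array) twice in a row is then harmless. It does not help when several copies of the same pointer exist, in which case an owning type such as std::unique_ptr is the safer choice.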

ThreadSanitizer

Data race

The code below has a data race due to having two threads modifying the same global variable.

#include <iostream>
#include <pthread.h>

int GLOBAL;

void * SetGlobalTo2(void * x) {
  GLOBAL = 2;
  return x;
}

void * SetGlobalTo3(void * x) {
  GLOBAL = 3;
  return x;
}

int main()
{
  pthread_t thread1, thread2;
  pthread_create(&thread1, NULL, SetGlobalTo2, NULL);
  pthread_create(&thread2, NULL, SetGlobalTo3, NULL);
  pthread_join(thread1, NULL);
  pthread_join(thread2, NULL);
  std::cout << GLOBAL << std::endl;
  return 0;
}

We can compile it with:

clang++-3.8 -fsanitize=thread -g -lpthread -o race race.cpp

When you run the resulting executable, ThreadSanitizer should print a data race warning that identifies the two conflicting writes to GLOBAL and the threads that performed them.
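One possible way to remove the race, assuming C++11 std::atomic is acceptable (a mutex around the writes would work as well), is to make the shared variable atomic. The sketch below only shows the parts that change; thread creation and joining are assumed to stay as in the example above.

```cpp
#include <atomic>

// Same global as in the example, but atomic: concurrent stores are no longer
// a data race. Which value ends up stored still depends on thread scheduling.
std::atomic<int> GLOBAL(0);

void *SetGlobalTo2(void *x)
{
  GLOBAL.store(2);
  return x;
}

void *SetGlobalTo3(void *x)
{
  GLOBAL.store(3);
  return x;
}
```

The program can still print either 2 or 3, but both outcomes are now well defined, and the final value can be read with GLOBAL.load().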

MemorySanitizer

Uninitialized values

The code below is reading the value stored in an array that has not been initialized.

int main()
{
  int *array = new int[50];
  return array[42];
}

We can compile it with:

clang++-3.8 -fsanitize=memory -g -o memory memory.cpp

When you run the resulting executable, MemorySanitizer should abort the program with a use-of-uninitialized-value warning.
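A minimal way to avoid the uninitialized read, assuming zero is an acceptable initial value, is to value-initialize the array by adding () to the new expression. The helper name read_initialized is hypothetical.

```cpp
// Hypothetical helper: same allocation as the example, but value-initialized.
int read_initialized()
{
  int *array = new int[50]();   // the trailing () sets every element to 0
  const int value = array[42];  // reads a defined value
  delete [] array;              // also release the array, unlike the example
  return value;
}
```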

UndefinedBehaviorSanitizer

Function not returning a value

The code below contains a function that must return an integer, but it does not.

int must_return_value()
{
  int result = 0;
  result += 1;
}

int main()
{
  int value = must_return_value();
  return value;
}

We can compile it with:

clang++-3.8 -fsanitize=undefined -g -o function function.cpp

When you run the resulting executable, UndefinedBehaviorSanitizer should print a runtime error saying that execution reached the end of a value-returning function without returning a value.
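The fix here is presumably the obvious one: return the computed value on every path. The sketch keeps the function from the example and only adds the missing return statement.

```cpp
int must_return_value()
{
  int result = 0;
  result += 1;
  return result;  // every path of a non-void function now returns a value
}
```

Clang also diagnoses this mistake at compile time through its -Wreturn-type warning, so it is worth reading the compiler output before reaching for the sanitizer.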

Conclusion

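Compiling your code with these tools and running your tests (integration tests, smoke tests, system tests, etc.) against the instrumented binaries helps you catch plenty of problems. The downside is that AddressSanitizer, ThreadSanitizer and MemorySanitizer cannot be combined in the same binary. LeakSanitizer is the exception because it is part of AddressSanitizer, and UndefinedBehaviorSanitizer can usually be added to any one of the others. Therefore, you need multiple binaries to test your code with all these sanitizers, but the payoff is worth it.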

Clang sanitizers

Date: 13/07/2016
