Carving utmp records for intrusion analysis

When I investigate Unix-like systems (mostly Linux) in intrusion cases, I always check utmp, wtmp, and btmp to keep track of suspicious logins and logouts. These files are not plain text, so I read them with the last/lastb commands and the -f option. However, the files are sometimes empty, because an attacker with administrative privileges can wipe them.

btmp and wtmp size 0

That's why I have created a utmp scanner for bulk_extractor-rec.


The format varies between operating systems such as Linux and FreeBSD. For now I have focused on the Linux format, because I have very little experience with anything other than Linux.

There are three files: utmp, wtmp, and btmp. utmp holds only the current status, wtmp is a history of utmp, and btmp records failed login attempts. All three share the same format, which is described in utmp(5) ('man 5 utmp').

The key definitions are as follows:

#define EMPTY         0 /* Record does not contain valid info (formerly known as UT_UNKNOWN on Linux) */
#define RUN_LVL       1 /* Change in system run-level (see init(8)) */
#define BOOT_TIME     2 /* Time of system boot (in ut_tv) */
#define NEW_TIME      3 /* Time after system clock change (in ut_tv) */
#define OLD_TIME      4 /* Time before system clock change (in ut_tv) */
#define INIT_PROCESS  5 /* Process spawned by init(8) */
#define LOGIN_PROCESS 6 /* Session leader process for user login */
#define USER_PROCESS  7 /* Normal process */
#define DEAD_PROCESS  8 /* Terminated process */
#define ACCOUNTING    9 /* Not implemented */

#define UT_LINESIZE      32
#define UT_NAMESIZE      32
#define UT_HOSTSIZE     256

struct exit_status {              /* Type for ut_exit, below */
    short int e_termination;      /* Process termination status */
    short int e_exit;             /* Process exit status */
};

struct utmp {
    short   ut_type;              /* Type of record */
    pid_t   ut_pid;               /* PID of login process */
    char    ut_line[UT_LINESIZE]; /* Device name of tty - "/dev/" */
    char    ut_id[4];             /* Terminal name suffix, or inittab(5) ID */
    char    ut_user[UT_NAMESIZE]; /* Username */
    char    ut_host[UT_HOSTSIZE]; /* Hostname for remote login, or kernel version for run-level messages */
    struct  exit_status ut_exit;  /* Exit status of a process marked as DEAD_PROCESS; not used by Linux init(1) */
    /* The ut_session and ut_tv fields must be the same size when compiled 32- and 64-bit.
       This allows data files and shared memory to be shared between 32- and 64-bit applications. */
#if __WORDSIZE == 64 && defined __WORDSIZE_COMPAT32
    int32_t ut_session;           /* Session ID (getsid(2)), used for windowing */
    struct {
        int32_t tv_sec;           /* Seconds */
        int32_t tv_usec;          /* Microseconds */
    } ut_tv;                      /* Time entry was made */
#else
    long   ut_session;            /* Session ID */
    struct timeval ut_tv;         /* Time entry was made */
#endif

    int32_t ut_addr_v6[4];        /* Internet address of remote host; IPv4 address uses just ut_addr_v6[0] */
    char __unused[20];            /* Reserved for future use */
};

bulk_extractor-rec utmp scanner

According to the above definition, each field has the following size.

field        size (bytes)
ut_type      4
ut_pid       4
ut_line      32
ut_id        4
ut_user      32
ut_host      256
ut_exit      4
ut_session   4
tv_sec       4
tv_usec      4
ut_addr_v6   16
unused       20
Total        384

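ut_type is declared as short (2 bytes), but I have confirmed that it occupies 4 bytes in the actual data. The compiler inserts 2 bytes of padding after it so that the following 4-byte ut_pid is aligned.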
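The size table above maps directly onto a format string for Python's struct module. The following is only a minimal sketch, not the code of the scanner or of my parser: it assumes little-endian data, reads ut_type as a 4-byte integer as in the table, and the names UTMP_FORMAT, UtmpRecord and parse_record are made up for illustration.

```python
import struct
from collections import namedtuple

# Assumed layout: little-endian, 384 bytes, ut_type read as a 4-byte integer
# (2-byte short plus 2 padding bytes), following the size table.
UTMP_FORMAT = "<ii32s4s32s256shhiii4i20s"
UTMP_SIZE = struct.calcsize(UTMP_FORMAT)  # 384

# Hypothetical container for one decoded record.
UtmpRecord = namedtuple(
    "UtmpRecord",
    "ut_type ut_pid ut_line ut_id ut_user ut_host "
    "e_termination e_exit ut_session tv_sec tv_usec ut_addr_v6",
)


def _text(raw):
    """Decode a NUL-padded byte field, dropping everything after the first NUL."""
    return raw.split(b"\x00", 1)[0].decode("ascii", errors="replace")


def parse_record(buf):
    """Unpack exactly one 384-byte record into a UtmpRecord."""
    f = struct.unpack(UTMP_FORMAT, buf)
    return UtmpRecord(
        ut_type=f[0], ut_pid=f[1],
        ut_line=_text(f[2]), ut_id=_text(f[3]),
        ut_user=_text(f[4]), ut_host=_text(f[5]),
        e_termination=f[6], e_exit=f[7],
        ut_session=f[8], tv_sec=f[9], tv_usec=f[10],
        ut_addr_v6=f[11:15],  # four raw 32-bit integers, left uninterpreted
    )
```

The trailing 20 reserved bytes are unpacked but not kept, and ut_addr_v6 is left as raw integers because this sketch makes no assumption about its byte order.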

To raise the precision of the utmp scanner, I created the following rules based on actual utmp records.

  • ut_type: 1-8 (because I have never seen 0 or 9)
  • ut_line, ut_user, and ut_host: printable ASCII characters and end with \x00
  • tv_sec: a positive number
  • tv_usec: 0-999999 (because usec 1000000 means 1 second)
  • unused: \x00..\x00
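The rules above can be sketched as a small Python check. This is not the scanner's actual implementation (bulk_extractor scanners are not written in Python); it is only an illustration under a few assumptions: little-endian data, ut_type read as a 4-byte integer, and "end with \x00" interpreted as printable ASCII followed only by NUL padding. The names looks_like_utmp and find_records are hypothetical.

```python
import struct

# Assumed 384-byte little-endian layout, ut_type read as a 4-byte integer.
UTMP_FORMAT = "<ii32s4s32s256shhiii4i20s"
UTMP_SIZE = struct.calcsize(UTMP_FORMAT)


def _is_printable_cstring(raw):
    """True if raw is printable ASCII followed only by NUL padding (at least one NUL)."""
    text = raw.rstrip(b"\x00")
    if len(text) == len(raw):  # no terminating NUL at all
        return False
    return all(0x20 <= b <= 0x7E for b in text)


def looks_like_utmp(buf):
    """Apply the five rules to one candidate 384-byte chunk."""
    if len(buf) != UTMP_SIZE:
        return False
    f = struct.unpack(UTMP_FORMAT, buf)
    ut_type, ut_line, ut_user, ut_host = f[0], f[2], f[4], f[5]
    tv_sec, tv_usec, unused = f[9], f[10], f[15]
    return (
        1 <= ut_type <= 8
        and all(_is_printable_cstring(s) for s in (ut_line, ut_user, ut_host))
        and tv_sec > 0
        and 0 <= tv_usec <= 999999
        and unused == b"\x00" * 20
    )


def find_records(data):
    """Naive byte-by-byte scan that yields the offset of each plausible record."""
    offset = 0
    while offset + UTMP_SIZE <= len(data):
        if looks_like_utmp(data[offset:offset + UTMP_SIZE]):
            yield offset
            offset += UTMP_SIZE  # skip past the record just found
        else:
            offset += 1
```

Reading ut_type as a 4-byte integer means the two padding bytes must also be zero for the 1-8 check to pass, which makes the test slightly stricter than checking the 2-byte short alone.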

The bulk_extractor-rec utmp scanner searches for patterns that meet these requirements and carves them out to a file named utmp.

bulk_extractor -E utmp -o output input
bulk_extractor -x all -e gzip -e utmp -o output input

carving utmp using bulk_extractor

If you want to search within gzipped data, the gzip scanner should also be enabled.

output file

In this instance, both wtmp and btmp are empty, but bulk_extractor-rec found a number of utmp records.


The records in the utmp file that bulk_extractor-rec produces are not in chronological order, so last/lastb may show incorrect information.
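One way to work around the ordering problem is to sort the carved records by tv_sec and tv_usec before reading them. The sketch below is only an illustration and not the parser mentioned in this post: it assumes the carved file holds whole 384-byte little-endian records back to back, prints timestamps in UTC, and uses the hypothetical names sorted_rows and to_tsv.

```python
import struct
from datetime import datetime, timezone

# Assumed 384-byte little-endian layout, ut_type read as a 4-byte integer.
UTMP_FORMAT = "<ii32s4s32s256shhiii4i20s"
UTMP_SIZE = struct.calcsize(UTMP_FORMAT)


def _text(raw):
    """Decode a NUL-padded byte field, dropping everything after the first NUL."""
    return raw.split(b"\x00", 1)[0].decode("ascii", errors="replace")


def sorted_rows(data):
    """Return (tv_sec, tv_usec, ut_type, ut_pid, line, user, host) tuples in time order."""
    rows = []
    # Walk the buffer one whole record at a time; a trailing partial record is ignored.
    for offset in range(0, len(data) - UTMP_SIZE + 1, UTMP_SIZE):
        f = struct.unpack_from(UTMP_FORMAT, data, offset)
        rows.append((f[9], f[10], f[0], f[1], _text(f[2]), _text(f[4]), _text(f[5])))
    rows.sort(key=lambda r: (r[0], r[1]))
    return rows


def to_tsv(rows):
    """Format the sorted rows as tab-separated lines with UTC timestamps."""
    lines = []
    for sec, usec, ut_type, pid, line, user, host in rows:
        ts = datetime.fromtimestamp(sec, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
        lines.append("\t".join([f"{ts}.{usec:06d}", str(ut_type), str(pid), line, user, host]))
    return "\n".join(lines)
```

For example, `print(to_tsv(sorted_rows(open("utmp", "rb").read())))` would list the carved records oldest first, where "utmp" stands for the path of the carved file.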

The simple Python 3 parser I uploaded to Gist parses utmp into TSV. You can download the exe from the following link.

(SHA-256: 60d14a3af0d5c0bf87c836a7c31b0f7c952c28c4916345df35c5c9208b79613f)

parse utmp

Deleted utmp records in particular may reveal the root cause of an intrusion. bulk_extractor-rec helps your work!





