Reimplementing the Tree Command

I decided I’m going to try to reimplement the tree utility from scratch in a very limited version. Mine will only work from the current directory and will support no arguments. This article will take you through my implementation of the problem.

The tree program produces a diagram of the directory structure from a given starting point, generally the current working directory. You can see an example in the screenshot below.

Tree output

My version is much simpler. It does not implement:

Sorting of each subdirectory result.
Colored output based on file type by dircolors
Command arguments to configure behavior.
Multiple output formats.

The first order of business is to open a directory and read it’s content. For that we need opendir and readdir. These functions from the posix standard. opendir opens a directory stream, and readdir reads from that stream to produce a dirent structs. These structs contain information about the current entry, such as the name and type of file.

// from http://man7.org/linux/man-pages/man3/readdir.3.html
struct dirent {
    ino_t          d_ino;       /* Inode number */
    off_t          d_off;       /* Not an offset; see below */
    unsigned short d_reclen;    /* Length of this record */
    unsigned char  d_type;      /* Type of file; not supported
                                   by all filesystem types */
    char           d_name[256]; /* Null-terminated filename */
};

The next task is to separate out the results of readdir from what we want. We need to be careful of the current directory entry, and the previous directory entry, if we recurse on those directories we could get an infinite loop during our in order traversal. We also need to filter out hidden files, because the normal find does not include hidden files by default. To do this, we examine the contents of the dirent.d_name field.

The next step is to recurse on entries that contain more entries. According to the man page when d_type is equal to DT_DIR, that entry is a directory, and we should recurse on them. But we run into a problem, we don’t have the full path. We could do string concatenation, and build on each step, but I didn’t feel like it. Instead I used chdir to move our context to that directory, and drop down once we finished up. This may not be very efficient(system calls), but it avoids buffer overflows, and C-style strings are miserable.

The final piece we must consider is counting up the number of files and the number of directories. I used a struct containing the counts, and returned them from each layer of the walk in the classic recursive way of building up a solution.

The important takeaways are:

Be careful of how you walk a directory, you need to ignore the current and parent directories.
Watch out for exhusting the available file descriptors for your program. You could do this if you try to go too deep or end up with an infinite loop.

Below is the finished version of this code. It’s quite simple as you can see.

/*
 * tree.c
 * Henry J Schmale
 * September 30, 2018
 *
 * A trivial reimplementation of the tree program with no support for
 * arguements. Works from the current directory, and produces a nice
 * ascii diagram of the directory structure. Once it has fully walked
 * the directory tree it prints out some stats.
 *
 */

#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include <errno.h>
#include <sys/types.h>
#include <dirent.h>
#include <unistd.h>

const int INDENT_SIZE = 4;

typedef struct {
    size_t num_dir;
    size_t num_files;
} dir_res_t;

/*
 * unix_error - unix-style error routine
 */
void 
unix_error (char *msg)
{
    fprintf (stderr, "%s: %s\n", msg, strerror(errno));
    exit (1);
}

static inline int
self_ref (const char* s) {
    return strcmp (s, ".") == 0 || strcmp(s, "..") == 0;
}

static inline int
hidden (const char* s) {
    return s[0] == '.' && s[1] != '\0';
}

dir_res_t
walk_dirtree (const char* start, int depth) {
    struct dirent* dir;
    dir_res_t stats = {0};

    DIR* dirstream = opendir (start);
    if (dirstream == NULL) {
        unix_error ("Failed to open dirstream"); 
    }
    while ((dir = readdir(dirstream)) != NULL) {
        if (self_ref (dir->d_name) || hidden(dir->d_name))
            continue;
        // print directory name indented
        printf ("%*c%s\n", depth, ' ', dir->d_name);
        if (dir->d_type == DT_DIR) { 
            if (chdir(dir->d_name) == -1)
                unix_error ("Failed to go deeper");

            dir_res_t d_res = walk_dirtree (".", depth + INDENT_SIZE);
            stats.num_files += d_res.num_files;
            stats.num_dir   += d_res.num_dir + 1;

            if (chdir ("..") == -1)
                unix_error ("Failed to go back up");
        } else {
            stats.num_files += 1;
        }
    }
    if (closedir (dirstream)) {
        unix_error ("Failed to close dirstream");
    }

    return stats;
}


int 
main (int argc, char** argv) {
    dir_res_t stats = walk_dirtree (".", 0);
    printf ("%d directories, %d directories\n", stats.num_dir, stats.num_files);
    return 0;
}

Reimplementing the Tree Command

2018 September 30

A selected list of related posts that you might enjoy:

← Previous Post

Whiskey Review - Southern Comfort

Next Post →

Temperature Monitoring on Linux

Reimplementing the Tree Command 2018 September 30