Data processing with the Linux tools


If you prefer Python [7] over Perl, check out the script in Listing 3, which is similar to Listing 2, except it uses a few tricks in the form of modules. Lines 2 and 3 state that the script will use predefined modules fileinput [8] and re [9], making it possible to process the input and output easily and use regular expressions.

Listing 3


01 #!/usr/bin/python
02 import fileinput
03 import re
04 total = 0
05 for line in fileinput.input():
06   if re.match(r'\d+\t+.*\t+.*\t*\d+', line):
07     columns = re.split(r'\t+', line)
08     total += int(columns[0]) * int(columns[3])
09 print("Total: %i miles" % (total))

The variable total, defined in line 4, is initialized with a value of zero. The for loop (lines 5-8) processes the driving log and iterates line by line over the input stream.

The call to the fileinput.input() function in line 5 provides the data stream as a sequence of lines that the loop can iterate over. The lines come either from STDIN or from the file that was provided as an invocation parameter.
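The following minimal sketch shows how fileinput.input() hands over lines one at a time. The sample rows and column names are made up for illustration and are not taken from the article's driving log; the temporary file only stands in for a file named on the command line.

```python
import fileinput
import os
import tempfile

# Hypothetical sample log: a header row plus two made-up data rows.
sample = "Trips\tFrom\tTo\tMiles\n2\tHome\tOffice\t15\n1\tHome\tAirport\t40\n"

# Write the sample to a temporary file that stands in for the driving log.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)
    path = tmp.name

# The file is passed explicitly here. Called without arguments,
# fileinput.input() uses the names in sys.argv[1:] and falls back
# to STDIN when no names are given.
lines = []
with fileinput.input(files=[path]) as stream:
    for line in stream:
        lines.append(line)

os.remove(path)
print(len(lines), "lines read")
```

Each element still carries its trailing newline, which is why the later int() conversion has to tolerate surrounding whitespace.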

With the help of a regular expression, line 6 then checks whether each line read conforms to the desired structure. The only lines considered are those that begin with one or more digits followed by one or more tab characters, zero or more arbitrary characters, one or more tabs, zero or more arbitrary characters, zero or more tabs, and finally one or more digits. Because the header line of the driving log does not begin with a digit, this regex automatically skips it.
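As a quick check of the pattern, the sketch below applies it to one header line and one data line. Both lines are invented examples that assume four tab-separated columns, and the helper name is_data_row is hypothetical.

```python
import re

# Same pattern as in Listing 3, written as a raw string.
PATTERN = re.compile(r'\d+\t+.*\t+.*\t*\d+')

# Made-up sample lines, assuming four tab-separated columns.
header = "Trips\tFrom\tTo\tMiles\n"
row = "2\tHome\tOffice\t15\n"


def is_data_row(line):
    """Return True if the line matches the pattern from its first character."""
    return PATTERN.match(line) is not None


print(is_data_row(header))
print(is_data_row(row))
```

Because re.match() anchors at the start of the line, the header fails on its first character. The pattern is fairly loose, however: a line with only three columns, such as "2", "Home", and "15", would also match, so the check assumes a well-formed log.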

Again using a regex, line 7 splits each matching line into individual columns at runs of one or more tabs. Line 8 then multiplies the first column by the fourth and adds the product to total; the explicit conversion of the column values with int() ensures that the result is an integer.
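The split and the conversion can be tried on a single line. In this sketch, the sample row is made up and the function name trip_miles is hypothetical; only the re.split() and int() calls mirror the listing.

```python
import re


def trip_miles(line):
    """Split a data line at runs of tabs and multiply column 1 by column 4."""
    columns = re.split(r'\t+', line)
    # The last column still ends in "\n"; int() ignores surrounding whitespace.
    return int(columns[0]) * int(columns[3])


# Made-up row, assuming four tab-separated columns.
row = "2\tHome\tOffice\t15\n"
columns = re.split(r'\t+', row)
print(columns)
print(trip_miles(row))
```

The list holds the strings '2', 'Home', 'Office', and '15\n', so the product for this row is 30.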

Line 9 prints the total distance. You can call the script in the same ways as the Perl version referred to previously:

$ python drivinglog.txt
Total: 1740 miles
$ ./ drivinglog.txt
Total: 1740 miles
$ cat drivinglog.txt | python
Total: 1740 miles
$ cat drivinglog.txt | ./
Total: 1740 miles


The Tool Command Language (Tcl) [10] might seem out of date, but its capabilities for processing text files are still useful today. The script in Listing 4 also reads its input from STDIN and relies on regular expressions.

Listing 4


01 set totaldistance 0
02 while {1} {
03   set line [gets stdin]
04   if {[eof stdin]} {
05     close stdin
06     break
07   }
08   set fields [regexp -all -inline \[^\t\]+ $line]
09   if {[string is integer -strict [lindex $fields 0]]} {
10     incr totaldistance [expr {[lindex $fields 0] * [lindex $fields 3]}]
11   }
12 }
13 puts "Total: $totaldistance miles"

Line 1 defines the totaldistance variable and initializes it to zero. A while loop (lines 2-12) then reads from STDIN (line 3) as long as input data is available. The loop exits when an end of file (eof) condition occurs (lines 4-7).

In line 8, the script separates each line into individual columns with the help of a regular expression that matches every run of non-tab characters, so that one or more tabs act as the separator. The fields variable is a list in which each element represents a column in the driving log.

Line 9 checks whether the character sequence in field 0 is an integer. If it is, the line is a data line rather than the header. The expression in line 10 multiplies fields 0 and 3 together (the list index begins with zero, so these are the first and fourth columns) and adds the result to the total. Note that the incr command accepts an optional second argument that specifies the amount to add.
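For comparison, the same field-based logic can be sketched in Python. This is an approximation rather than a translation: str.isdigit() is narrower than Tcl's string is integer, the function name add_line is hypothetical, and the sample rows are made up.

```python
import re


def add_line(total, line):
    """Add the product of fields 0 and 3 to total if field 0 is an integer."""
    # Collect every run of non-tab characters, as the Tcl regexp does.
    fields = re.findall(r'[^\t]+', line.rstrip('\n'))
    # Header and empty lines fail this check and leave total unchanged.
    if fields and fields[0].isdigit():
        total += int(fields[0]) * int(fields[3])
    return total


# Made-up rows, assuming four tab-separated columns.
rows = [
    "Trips\tFrom\tTo\tMiles\n",
    "2\tHome\tOffice\t15\n",
    "1\tHome\tAirport\t40\n",
]

totaldistance = 0
for row in rows:
    totaldistance = add_line(totaldistance, row)

print("Total: %i miles" % totaldistance)
```

Testing the first field instead of matching the whole line keeps the pattern short, at the cost of trusting that every data line really has four columns.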

After the loop finishes, line 13 outputs the total distance. The Tcl script expects the driving log via STDIN; therefore, you should use the following invocation to execute the script:

$ cat drivinglog.txt | /usr/bin/tclsh distance.tcl
Total: 1740 miles
