Extracting CSV summaries by day

Gwern Branwen gwern at gwern.net
Thu Jun 26 22:53:49 CEST 2014


On Tue, Nov 5, 2013 at 5:56 PM, Gwern Branwen <gwern at gwern.net> wrote:
> On Tue, Nov 5, 2013 at 5:22 PM, Joachim Breitner
> <mail at joachim-breitner.de> wrote:
>> ok, try this and tell me what you think of it:
>> http://people.debian.org/~nomeata/arbtt/arbtt_0.7.1-1~pre1_amd64.deb
>
> Looks good. I think I can use it. (R has routines for converting long
> to wide format, so the column vs row thing isn't a really big
> problem.)

Yes, it turns out to work pretty nicely. I've been looking at a factor
analysis of my various metrics, and it was pretty easy to
incorporate the arbtt CSV output (once I repaired the log with
arbtt-recover, yet again).

So for my current purpose my workflow goes:

    $ arbtt-stats --logfile=/home/gwern/doc/arbtt/2013-2014.log \
          --output-format="csv" --for-each="day" --min-percentage=0 \
          > 2013-2014-arbtt.txt
    $ emacs -nw 2013-2014-arbtt.txt # delete before 2 March 2014 and after 24 June 2014; rename 'Day'->'Date'
    $ mv 2013-2014-arbtt.txt 2014-marchjune-arbtt.csv
    $ R
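
The manual emacs step could also be scripted. What follows is only a
sketch, not what I actually ran: trim_days is a hypothetical helper, and it
assumes the first CSV column is an unquoted ISO date (YYYY-MM-DD), so that
plain string comparison orders the days correctly.

```shell
# Hypothetical helper: keep the header row plus the rows whose first
# column falls inside [from, to], and rename the header 'Day' -> 'Date'.
# Assumes column 1 is an unquoted YYYY-MM-DD date.
trim_days() {
    awk -F, -v from="$1" -v to="$2" '
        NR == 1 { sub(/^Day/, "Date"); print; next }   # header row
        $1 >= from && $1 <= to                          # rows inside the range
    '
}

# Toy input, made up for illustration only:
printf 'Day,Tag,Time,Percentage\n2014-03-01,a,0:00:10,1\n2014-03-02,a,0:16:40,2\n' \
    | trim_days 2014-03-02 2014-06-24
```

On the real file that would be something like
"trim_days 2014-03-02 2014-06-24 < 2013-2014-arbtt.txt > 2014-marchjune-arbtt.csv",
replacing the emacs and mv steps.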

Then in R we can get a nice clean wide format dataset thusly:

    arbtt <- read.csv("2014-marchjune-arbtt.csv")
    arbtt$Percentage <- NULL # we don't care

    # Convert time-lengths to second-counts: "0:16:40" to 1000 (seconds);
    # "7:57:30" to 28650 (seconds) etc.
    # We prefer units of seconds since arbtt has sub-minute resolution and
    # not all categories will have a lot of time each day.
    interval <- function(x) {
        if (is.na(x)) return(NA)
        if (grepl(" s", x)) {
            # value carries an " s" suffix: strip it and keep the number
            as.integer(sub(" s", "", x))
        } else {
            # "h:mm:ss": split on ":" and sum the parts in seconds
            y <- unlist(strsplit(x, ":"))
            as.integer(y[[1]])*3600 + as.integer(y[[2]])*60 + as.integer(y[[3]])
        }
    }
    arbtt$Time <- sapply(as.character(arbtt$Time), interval)

    # reshape() is the base R (stats) function, so no extra package is needed
    arbtt <- reshape(arbtt, v.names="Time", timevar="Tag", idvar="Date", direction="wide")
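
The h:mm:ss arithmetic can be spot-checked outside R. This is just a
sanity check of the two examples in the comment, not part of the workflow;
hms_to_seconds is a hypothetical helper name.

```shell
# Hypothetical helper: convert "h:mm:ss" to a count of seconds.
hms_to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

hms_to_seconds 0:16:40   # prints 1000
hms_to_seconds 7:57:30   # prints 28650
```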

-- 
gwern
http://www.gwern.net



