Mountain pictures from my last holidays

Here they are, now that I have just finished sorting them:
IMGP1274

IMGP1209

The problem with big cameras is that they are big and expensive, so I did not want to take mine skiing. Yet it is when you ski, when you explore the mountain, that you get to see fantastic landscapes. Plus, when I ski, I am not in the mood to stop and wonder which picture would be interesting to take or how I should set up the camera… So these two pictures were taken in my after-ski time. Still, taking photos was a nice way to spend my spare time.

Posted in mountain, photos | Leave a comment

I am a flickr user

I am taking advantage of this Xmas downtime to start publishing some of my photos on the web.

I am using a combination of tools to make it simple to publish:

  • digikam for photo management, tagging, editing.
  • imagemagick for resizing, adding copyright at the bottom of the images
  • flickrfs for publishing.

All of the pictures published here are public. I must say that I am very concerned about privacy on the web and tend to consider that if it is on the web, then it is public. So you’ll never find pictures of me or my relatives here. If it is not public, it is not on the web. Arf, so old school…

Posted in photos | Leave a comment

Humanity is just a big statistical database

In the past 2 years, we have seen applications emerge that bring ideas and concepts from Business Intelligence to individuals. These applications provide analytical functions for a specific area of an individual’s life, such as personal finance analysis or, more recently, personal fitness tracking. I am stunned to see this business behavior, which consists of quantifying every possible thing, spreading to the individual. (With the difference that companies own their data, unlike individuals who track their expenses on a remote server, but that is another story.)

Take the case of fitbit. This is an example of data that is not only targeted at the individual but is also about the individual. To my mind, this initiative has interesting potential, not really for the user (this is my personal view, one might feel differently), but for the people who will have access to the statistics of thousands of users, provided there are thousands of users, of course. The fitbit team could then use the data to perform unprecedented studies for sociological, health or commercial purposes, taking advantage of data collected right from the human body.

If these companies make their way and win enough users, they have the potential to be a real hit. After all, google is making money while people write emails, so why shouldn’t fitbit make money while people sleep? Like, hum, this group of people hasn’t slept more than 4 hours in the past 4 days, they live in California and are in their twenties, let’s try and sell them some energy drink; and this other group is over 70, let’s entice them to invest in sleeping pills.

These examples clearly show a move of BI technologies and practices towards individuals both as a source of data and as a target. Are we witnessing the birth of Personal Intelligence?

Posted in Business Intelligence, opinion | Leave a comment

Holiday photos

after-sunset-iledere1.jpg sunset-iledere.jpg

port-iledere.jpg boats-clouds-iledere.jpg
Location: Ile de Ré (France)
Date: September 2008

I just want to share some photos I took recently with my brand new camera. They were taken on the Ile de Ré, an island off the west coast of France.

For the 2 sunsets, I wanted to bring out the wonderful range of colors in the sky.

On that same night, the port was interesting for the reflections in the water as the weather was very calm. Also the lights from the restaurants gave the whole place a warm touch.

Finally, I chose black and white to stress the “calm before the storm” atmosphere.

Posted in photos | 1 Comment

My own backup script in bash

Remember, a while ago I wrote about backup solutions for Linux and concluded that a bash script was what I needed. Well, here it is, at last.

Invocation

The main functionalities and the invocation scheme are described by the -h option:

Usage: kipit.sh [actions]
  Performs actions sequentially as specified.
Actions:
  full         start a full backup
  incremental  start incremental backup since last full backup
  clean        remove all backups but the last full backup
  send         send backups to sftp server
  shutdown     shutdown
Example:
  kipit.sh clean incremental send shutdown
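
The usage text says that actions run in the order they are given. A minimal sketch of how such a dispatch loop could look is shown here. It is not the actual main loop of kipit.sh: the stub functions, `usage` and `run_actions` are hypothetical names used only for illustration, and shutdown is deliberately left out of the demo.

```shell
#!/bin/bash
# Hypothetical sketch: run the actions named on the command line, in order.
# These stubs stand in for the real functions of the script.
full()        { echo "full"; }
incremental() { echo "incremental"; }
clean()       { echo "clean"; }
send()        { echo "send"; }

usage() { echo "Usage: kipit.sh [actions]"; }

# Walk through the arguments and call the matching function.
# Stops with a non-zero status at the first unknown action.
run_actions() {
    local action
    for action in "$@"; do
        case "$action" in
            full|incremental|clean|send) "$action" ;;
            -h) usage ;;
            *)  echo "Unknown action: $action" >&2; return 1 ;;
        esac
    done
}

# In a real script this would be: run_actions "$@"
run_actions clean incremental send
```

Stopping at the first unknown action is a design choice of this sketch: a typo in the action list then never lets a later action (a shutdown, say) run by surprise.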

Script format

The script is called kipit.sh.
It loads a config file, which is also a bash script that just sets variables.
It uses an exclusions file which lists the files and directories not to back up. This file is referenced in the config file and is passed to the --exclude-from option of tar.
Each functionality of the script is wrapped in a function.
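
To make the format concrete, here is what such a config file might look like. The variable names are the ones used by the functions below; every value, the file name kipit.conf and the choice to set TIMESTAMP here rather than in the script itself are assumptions made for illustration.

```shell
# kipit.conf (hypothetical name and values): plain variable assignments,
# meant to be loaded by the script with "." or "source".
DIRECTORIES="/home/me"                   # directory tree to back up
TARBALL_DIR="/home/me/backups"           # where the tarballs are written
TARBALL_BASENAME_FULL="full"             # name prefix of full backups
TARBALL_BASENAME_INCR="incr"             # name prefix of incremental backups
EXCLUDE_FROM="/home/me/.kipit.exclude"   # exclusions file, one tar pattern per line
RSH="me@backup.example.org"              # remote user@host used by rsync
REMOTE_DIR="backups/laptop"              # destination directory on the server
TIMESTAMP=$(date +%Y%m%d-%H%M%S)         # stamp appended to tarball names
```

The exclusions file is then just a list of patterns, one per line, for example `*.iso` or `.cache`.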

Now let’s get to the actual functions:

Perform a full backup
I just use tar to back up the directory $DIRECTORIES, gzip-compressed into a .tar.gz file. Of course I exclude the directory that contains the backups, as well as the directories listed in the $EXCLUDE_FROM file.

# Full
function full {
    echo -n "Starting full backup..."
    # every expansion is quoted so that paths containing spaces survive
    tar --exclude="$TARBALL_DIR" --exclude-from="$EXCLUDE_FROM" -czf "$TARBALL_DIR/${TARBALL_BASENAME_FULL}_${TIMESTAMP}.tar.gz" "$DIRECTORIES"
    echo "OK"
}

Perform incremental backup, since last full backup
First we get the name of the last full backup tarball, which requires some *nix filters.
Then we print the list of files whose status change time (ctime) is more recent than the modification time of the last full backup tarball, which is what find -cnewer tests. We print that list to stdout.
This stream is read by tar. The tar command is the same as for the full backup, except for -T -, which tells tar to take the list of files from stdin.

# Incremental
function incremental {
    echo -n "Starting incremental backup..."
    # last full backup tarball: list newest first and keep the first name
    # (cutting a fixed column out of "ls -l" breaks when the date format changes)
    last_full_tarball=$(ls -t "${TARBALL_DIR}"/*"${TARBALL_BASENAME_FULL}"*.tar.gz | head -n 1)
    new_tarball="$TARBALL_DIR/${TARBALL_BASENAME_INCR}_${TIMESTAMP}.tar.gz"
    # archive only files whose ctime is newer than the mtime of the last full backup tarball
    find "$DIRECTORIES" -cnewer "$last_full_tarball" -type f -print | tar --exclude-from="$EXCLUDE_FROM" --exclude="$TARBALL_DIR" -cz -T - -f "$new_tarball"
    echo "OK"
}
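
The find | tar pipeline above passes one file name per line. A possible variant, sketched here and not part of the original script, separates names with a null byte instead, so that unusual file names cannot be split. It assumes GNU find and GNU tar. The temporary directory and file names are made up for the demo, and it uses -newer (modification time) rather than -cnewer only because touch can fake a modification time but not a ctime.

```shell
#!/bin/bash
# Sketch: null-delimited variant of the find | tar pipeline (assumes GNU find and GNU tar).
work=$(mktemp -d)
mkdir "$work/data"
touch -t 200501010000 "$work/data/old.txt"   # older than the reference file
touch -t 201001010000 "$work/ref"            # stands in for the last full tarball
touch "$work/data/new file.txt"              # newer, with a space in its name

# -print0 and --null keep each file name intact, whatever characters it contains.
( cd "$work" &&
  find data -newer ref -type f -print0 |
  tar --null -T - -czf incr.tar.gz )

# List what ended up in the archive, then tidy up.
archived=$(tar -tzf "$work/incr.tar.gz")
echo "$archived"
rm -rf "$work"
```

Only the file modified after the reference ends up in the archive.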

Tidy up the house!!
The goal is to remove useless backups in order to keep only the last full backup and the following incremental backups. It saves space and you don’t need to keep those old backups locally. We are not making a time machine here, we are just trying to protect from crashes.
We find the tarballs which are not newer (! -newer) than the last full backup and print their names to stdout. Each of these names is then passed on to rm -f via xargs.

# Clean
function clean {
    echo -n "Cleaning..."
    # last full backup tarball: list newest first and keep the first name
    last_full_tarball=$(ls -t "${TARBALL_DIR}"/*"${TARBALL_BASENAME_FULL}"*.tar.gz | head -n 1)
    # remove tarballs older than the last full backup tarball, but keep that tarball itself
    find "$TARBALL_DIR" ! -newer "$last_full_tarball" -type f -name "*.tar.gz" ! -name "$(basename "$last_full_tarball")" -print | xargs rm -f
    echo "OK"
}

Send this far away
What we want to do is send all the backups we gathered locally to a remote location, such as an sftp server for instance. I chose rsync because it can transmit only the files not already present remotely and it can resume a transfer where it stopped. This is cool because backups are large files and transfer errors are likely to happen. Say you are in a train station, using a public wifi access point. You start to send your backup home and suddenly you have to hurry up, pull the plug and jump on the train, losing the connection. Aha! You **need** to recover from a partial transfer!
So, --progress prints info on the progress status of the transfer. --partial is for recovering from partial transfers. -a is for archive mode, like in cp -a. -v means verbose. -h means human readable. -e is to specify the remote shell we use.

# Send tarballs
function send {
    echo -n "Sending tarballs..."
    # -e ssh: use ssh as the remote shell; quoting protects paths containing spaces
    rsync --progress --partial -avhe ssh "$TARBALL_DIR/" "$RSH:$REMOTE_DIR/"
    echo "OK"
}

Shutdown

#shutdown
function shutdown {
    echo "Shutting down..."
    # "command" bypasses this function; a bare "shutdown" here would call the function again
    command shutdown -P 0
}

Future of this code
Wish list:

  • Back up multiple directories
  • Have the credentials for the remote server stored in some way
  • Insert some controls on loading the config file, because for now anything is executable and we are root…ahum…
  • Use a config file specified on the command line
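
The last two wishes could be approached along these lines. This is only a sketch of one possible design, not code from kipit.sh: the -c option, the default path ~/.kipit.conf and the function names `config_is_safe` and `load_config` are all assumptions.

```shell
#!/bin/bash
# Hypothetical sketch: pick the config file with -c and refuse to source it
# unless it is a regular file that we own and that nobody else can write to.

config_is_safe() {
    local file=$1 perms
    [ -f "$file" ] && [ -O "$file" ] || return 1
    perms=$(ls -ld "$file")
    case "$perms" in
        ?????w*|????????w*) return 1 ;;   # group-writable or world-writable
    esac
    return 0
}

load_config() {
    local config="$HOME/.kipit.conf" opt   # default path is an assumption
    OPTIND=1
    while getopts "c:" opt; do
        case "$opt" in
            c) config=$OPTARG ;;
            *) return 2 ;;
        esac
    done
    if ! config_is_safe "$config"; then
        echo "Refusing to load $config" >&2
        return 1
    fi
    . "$config"
}

# Demo with a throwaway config file.
demo=$(mktemp)
echo 'DIRECTORIES="/tmp/demo"' > "$demo"
chmod 600 "$demo"
load_config -c "$demo" && echo "loaded: $DIRECTORIES"
chmod 666 "$demo"
load_config -c "$demo" 2>/dev/null || echo "rejected once world-writable"
rm -f "$demo"
```

These checks do not make a sourced config file safe in any absolute sense, since it is still executed as code. They only make sure that nobody but its owner could have edited it.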

What this script will never be
I do not need an automatic backup solution. I do not need a server deciding for me when I should do my backups. I want to decide when it is time for me to back up. It’s like that. Because it is a much simpler solution (the remote server only needs to accept sftp transfers, which is just a basic ssh setup that should already be there), because I get to decide when I want to spend my bandwidth on backup transfers, and last but not least because my laptop is seldom up.
To sum up, KISS is the principle that prevailed.

Download



Posted in backup, bash, development | 1 Comment

Further down the spiral: Geeking out at Mc Donald’s

Yes, I must confess that I totally cracked. I do not have internet access at home and the situation has been going on since we moved to a new apartment. It takes some time for ISPs to bring a line up…

I wasn’t able to blog, to update my machine, to get fresh content from the web. And today, I am trying to make up for all this time spent away from the good old web: I am sitting at a Mc Donald’s, taking advantage of the free and unencrypted wifi access. Oh my god, I must look like a geek!! ;-)

My machine is already happier, she (yes it is a she) is getting fresh blood from ubuntu package repositories and she got new software to interact with gps hardware and stuff. Ahhh…. I am relieved (temporarily).

Posted in Uncategorized | Leave a comment

Stuff Warehousing (step #3: Implementation)

Now that the big picture is clear, let’s get down to the actual building of this stuff warehousing infrastructure. It has some similarities with IT engineering:
Tools:

  • you need sql, a file editor…
  • you need a saw, screwdrivers…

Bricks:

  • planks, screws, cleats
  • an ETL, a DBMS, etc…

Process:

  • have to create the tables before designing the loading jobs
  • have to build the main structure before adding the shelves

Anyways, here is a result I am proud of:
dsc00261.JPG

You should know that this piece of work successfully passed the integration tests, user acceptance tests and it handles the load pretty well.

Well, I guess this is the end of this series of posts about my DIY weekends and the desperate need for digital activities they foster.

Posted in DIY, Warehousing | Leave a comment

Stuff Warehousing (step #2: General and Detailed Specifications)

Recently I started a series of articles about my first DIY closet building project, along with personal reflections on the parallels between a stuff warehousing problem and a data warehousing one (which might be part of the answer to why I enjoy building this closet…). Now that the need for a stuff warehousing solution has been revealed, I am going to apply a standard software project management method to this stuff warehousing project. Let me present my general and detailed specifications.
Here they are:

General requirements:

  • It has to store shoes, hung coats, food, and other various things.
  • It has to fit in the available space.
  • It has to be cheap.
  • It has to be reusable if we move out.

Detailed Specifications:
specs_placard.jpg

Yes, that’s all. It is the old school version of software development: there are 2 users, one of which is also the designer and the coder (/me). So basically, it’s all in my head.

Coming soon: Stuff Warehousing (step #3: Implementation).

Posted in Business Intelligence, DIY, Warehousing | Leave a comment

Lltag: Interactive command line mp3 tagging

I am trying to build a personal music warehouse from various sources: my CDs, borrowed CDs, collected mp3s and so on. This project, once completed, will make my life sound much better, as I’ll have my music available for listening both at home and outside thanks to my kurobox hardware. One aspect of this project is getting correctly tagged music.

lltag --cddb --rename "%n - %t" *.mp3

is THE command line. Lltag is an interactive command line utility that you can invoke as above. Here I am telling lltag to use cddb to fetch the information and then to rename the tagged mp3 files according to the scheme “Track number – Song title”. The program then asks you for keywords and works like google: you can type the artist’s name or the album’s title and you get a list of artists/albums from which you can choose. There is also a really helpful help menu. With some practice, it takes no more than 30 seconds to tag an album.

I prefer this tool to easytag because easytag finds cds based on an id computed from the length of songs. And sometimes the ripping program is 1 second off. Hence the id is not valid, hence you cannot tag your album.

So overall I think lltag works better and is perfectly suited to tagging ripped cds.

Give it a try!

Posted in home server, music, software, tagging | Leave a comment

Stuff Warehousing (step #1: Detecting the need)

My job in the Business Intelligence field often involves dealing with data warehouses, which are big databases organized to suit reporting and data analysis needs. A data warehouse is an organized place to store data, just like a closet is an organized place to store stuff in the house. I would like to stress the parallel between building a data warehouse (which I know how to do) and building a closet (which I will soon know about as well). I actually need to build 2 of them for my new apartment.

First of all you detect there is a need for something:

  • We have all our data spread out there in excel spreadsheets, we have to do something about it.
  • We have all our stuff spread out there at our parents’ home and all over the apartment, we have to do something about it.

Then you go see your users and set up the general requirements:

  • I want all my data in there, available when I want and easily so I can still use my excel spreadsheets to do secret calculations that I do not want the others to know about.
  • I want all the stuff that wanders in the house in there. And I don’t want to use a stool. And I want to pick my dress without having to bend or to stand on my tiptoes.

Meanwhile you need to keep a technology watch:

  • What skills do I need, what are the different approaches to building this thing?
  • What amount of effort, money and time do I have to put in it?
  • What are the solutions available on the market? What are their advantages and disadvantages with respect to my needs? What is their price? How do their ROIs compare? (nooo, come on! nobody ever does that in data warehousing :-) )

Then you have to start confronting the needs with reality: you have to show your users what is possible, make them talk, and steer them towards a solution that you know is feasible and affordable. The key here is to keep them involved by making the product real and by taking their needs into consideration:

  • ok the rod will be 1.4 m high but you’ll have some lost space beneath.
  • ok you’ll have your data daily but we can’t keep 10 years of daily data.
  • how about we have a geographical dimension to analyse how our clients are spread on the territory
  • how about we have a 2 columns closet, one with shelves and the other to hang clothes.

Coming soon: Step #2: General and detailed specifications.

Posted in Business Intelligence, DIY, opinion, Warehousing | Leave a comment