Sunday, April 27, 2008

How to find the cheapest train tickets by website scraping

One of my niggles with train timetable websites is that it can be very tricky to find the cheapest train ticket if you want to ask questions like "What is the cheapest ticket if I could travel on Sunday, Monday or Tuesday next month?" or, even worse, "What would be the cheapest 1st class ticket if I can travel any weekend in June?". If you phone a train hotline they will probably give up (and you are normally paying 10p/min while they try), or they will guess and might miss the best deal.

My solution was to use iMacros, a macro-recording plugin for Firefox that can automate interactions with a website. I can now set up an initial script and leave the browser to look for the cheapest possible ticket.

For example, below are the results for London to Nottingham travelling in the mornings (9am to 1pm) from 3rd to 5th May, searching for the cheapest direct train ticket in each 2 hour slot. It took 20 seconds to configure the script (after taking a day to develop it). My computer then took 5 minutes to run the search, as it has to enter data on the timetable website 12 times (3 days x 2 time slots x 2 classes of travel) and wait each time for the website to refresh with new data. I chose the Virgin Trains website because it had quite a nice interface, which made it easy to pick the first radio button to find the cheapest ticket, and because it seemed to return the most comprehensive results. I kept finding discrepancies on other sites, which surprised me as I assumed they shared the same underlying data.
Results for London to Nottingham

Searched on Sun, 27 Apr 2008 12:44:50 GMT

Saturday 3 May 2008
Slot | Time | Std | Time | 1st
09:00 | 08:55 | £ 15.00 | 08:55 | £ 18.00
11:00 | 12:55 | £ 15.00 | 11:55 | £ 18.00

Sunday 4 May 2008
Slot | Time | Std | Time | 1st
09:00 | 09:00 | £ 15.00 | 09:00 | £ 18.00
11:00 | 11:00 | £ 15.00 | 11:00 | £ 18.00

Monday 5 May 2008
Slot | Time | Std | Time | 1st
09:00 | 08:55 | £ 11.00 | 08:55 | £ 18.00
11:00 | 10:55 | £ 11.00 | 10:55 | £ 18.00
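
Once the results are in, picking the best deal is just a scan for the lowest fare. The sketch below is not taken from my macro; it is a minimal standalone illustration in plain JavaScript, and the object layout and the function names `parseFare` and `cheapest` are assumptions made up for the example. The data is copied from the tables above.

```javascript
// Results copied from the tables above.
// The object layout is an assumption for illustration only.
const results = [
  { date: "Saturday 3 May 2008", slot: "09:00", stdTime: "08:55", std: "£ 15.00", firstTime: "08:55", first: "£ 18.00" },
  { date: "Saturday 3 May 2008", slot: "11:00", stdTime: "12:55", std: "£ 15.00", firstTime: "11:55", first: "£ 18.00" },
  { date: "Sunday 4 May 2008",   slot: "09:00", stdTime: "09:00", std: "£ 15.00", firstTime: "09:00", first: "£ 18.00" },
  { date: "Sunday 4 May 2008",   slot: "11:00", stdTime: "11:00", std: "£ 15.00", firstTime: "11:00", first: "£ 18.00" },
  { date: "Monday 5 May 2008",   slot: "09:00", stdTime: "08:55", std: "£ 11.00", firstTime: "08:55", first: "£ 18.00" },
  { date: "Monday 5 May 2008",   slot: "11:00", stdTime: "10:55", std: "£ 11.00", firstTime: "10:55", first: "£ 18.00" },
];

// Convert a fare string such as "£ 15.00" to a number of pounds.
function parseFare(text) {
  return parseFloat(text.replace(/[^0-9.]/g, ""));
}

// Return the row with the lowest fare for the given key ("std" or "first").
// On a tie, the earliest row in the list wins.
function cheapest(rows, fareKey) {
  let best = rows[0];
  for (const row of rows) {
    if (parseFare(row[fareKey]) < parseFare(best[fareKey])) {
      best = row;
    }
  }
  return best;
}

const bestStd = cheapest(results, "std");
console.log(bestStd.date, bestStd.stdTime, bestStd.std);
// prints: Monday 5 May 2008 08:55 £ 11.00
```
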
I have tested very long queries that take more than an hour to run, and the only problem may be time-outs from the website. In these cases iMacros appears to lock up, but pressing the Pause/Resume button lets it continue without losing data. It is best not to use the browser while an iMacros script is running, but I have successfully used a different browser (Safari) at the same time without any problems.

The macro is run from a JavaScript file in iMacros which calls iMacros commands (previously I passed variables to an iim file, but it seems easier to put it all in one file). I tweak the "Set up query" section of the JavaScript to query a particular train route; this could easily be set from user prompts instead. For the time being it is always cheapest to get two singles rather than a return, so I've only bothered writing the macro for a single. Note that the slot hours (set in the array myhour) are based on the maximum number of hours that Virgin Trains will display for the particular train route. For London/Nottingham this is 2 hours and for London/Redruth this is 3 hours. As a bonus, the macro returns the date in long text form from the Virgin site and keeps the iMacros code display updated with the estimated time left.
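
To give an idea of the shape of this, here is a minimal sketch of a "Set up query" section and the loop it drives. It is not the actual macro: only the array name `myhour` comes from my script, while the `query` object, the date strings, and the functions `buildSearches` and `estimateSecondsLeft` are hypothetical names invented for the example. The iMacros calls that fill in the web form are left out, so this runs as plain JavaScript.

```javascript
// "Set up query" section. All values and field names except myhour
// are illustrative assumptions, not the real script's settings.
const query = {
  from: "London",
  to: "Nottingham",
  days: ["3 May 2008", "4 May 2008", "5 May 2008"],
  myhour: [9, 11],          // slot start hours; this route shows 2 hours per page
  classes: ["Std", "1st"],
};

// Build one search per combination of day, slot and class of travel.
function buildSearches(q) {
  const searches = [];
  for (const day of q.days) {
    for (const hour of q.myhour) {
      for (const travelClass of q.classes) {
        searches.push({ from: q.from, to: q.to, day, hour, travelClass });
      }
    }
  }
  return searches;
}

// Estimate the seconds remaining from the average time per completed search.
function estimateSecondsLeft(done, total, elapsedSeconds) {
  if (done === 0) return null; // nothing to base an estimate on yet
  return Math.round((elapsedSeconds / done) * (total - done));
}

const searches = buildSearches(query);
console.log(searches.length);                              // 12
console.log(estimateSecondsLeft(3, searches.length, 75));  // 225
```

With 25 seconds per search, the 12 searches come to 5 minutes, which matches the London to Nottingham run described earlier.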

The latest version opens a results window to show the data in an HTML table. You can use iimDisplay() instead, but it is limited to a tiny window and was (at the time of writing) not resizable by scripting.
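
Building the table itself is simple string work. The following is a rough sketch rather than the code from my script: `escapeHtml` and `toHtmlTable` are hypothetical helper names, and the rows are two lines from the Monday results above.

```javascript
// Escape characters that have special meaning in HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build an HTML table from a header row and data rows (arrays of cell values).
function toHtmlTable(headers, rows) {
  const head = "<tr>" + headers.map(h => "<th>" + escapeHtml(h) + "</th>").join("") + "</tr>";
  const body = rows
    .map(r => "<tr>" + r.map(c => "<td>" + escapeHtml(c) + "</td>").join("") + "</tr>")
    .join("\n");
  return "<table>\n" + head + "\n" + body + "\n</table>";
}

const html = toHtmlTable(
  ["Slot", "Time", "Std", "Time", "1st"],
  [
    ["09:00", "08:55", "£ 11.00", "08:55", "£ 18.00"],
    ["11:00", "10:55", "£ 11.00", "10:55", "£ 18.00"],
  ]
);
console.log(html);
```

In a browser the resulting string could then be written into a new window, for example with window.open() followed by document.write(); treat that step as an assumption here, since how a window can be opened depends on the environment the script runs in.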

Here's the source code. Note that the source code is not word wrapped and long lines may appear truncated, but if you copy and paste it into your editor you should see all the text.

JavaScript source code: Virgin-Trains.js

Monday, April 14, 2008

How to use free GPS on a K750i mobile phone

Last year I bought a tiny Bluetooth GPS device off eBay for £20. I set it up with my laptop and we managed to drive around Germany using a copy of Autoroute, which showed our planned route and current GPS coordinates and even shouted helpful instructions such as "turn left in 400 meters"! We planned our route in the hotel the night before. As we are often looking for poorly signposted ancient sites, it is terribly handy to have the confidence of the computer pointing you down an empty track rather than relying on guesswork.

I did get the GPS recognized by my Sony Ericsson K750i (due to a previous hack it runs the W800i firmware) and used some trial software to show coordinates, but little else. Yesterday I spent some time researching further and managed to get it working quite beautifully.

The simple free Java application is called TrekBuddy and it has an associated wiki site. After connecting to the GPS via Bluetooth, the application shows where you are on a scrolling map that you define and store locally on the memory stick. I have it working with an almost A to Z street level map of London (spanning West Hammersmith through to Catford). You can load other maps as you go along, so a later improvement would be to install a higher level road map of the UK. My map of London is only 2 MB in size (I use a 4 GB stick!), so it would be no issue to carry around several maps.

I had to spend some time tweaking the configuration. In particular, the initial location of the maps had to be entered by hand to point to the memory stick ("file:///E:/other/mapdata/london"), and it took a while to work out that I needed to tick the option to enable a large atlas in order to display anything. The only other hard part was preparing the map. Luckily someone has worked all this out, and I imported maps from OpenStreetMap (setting the default datum to WGS 84) using the tool listed on the wiki. I also managed to import maps using the rather neat Google Maps to TrekBuddy tool on the same page. The latter is slightly more limited in the overall size of the map, but has the benefit of a better quality street map (just right for a pedestrian map of Central London).