What on earth is a mashup? I don’t mean mashup in the media-sampling sense of DJ Earworm or Soda Jerk, but rather in the Web sense:
a mashup is a combination of two separate data sources available on the web, into some service or data set. Like the sampling sort of mashup, web mashups tend to be particularly valuable if they are surprising or subversive – but just plain useful is also good.
Here are some examples, in varying degrees of sophistication.
Now, let’s look at the ingredients of mashups:
Have a poke around these guys, and see if you can propose some more mashups to me (hint: you can get started by simply picking your favourites off ProgrammableWeb).
So what are the ingredients of the mashup?
- At least two data sets…
- …in a recognised format…
- …and some software to tie them together
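To make those three ingredients concrete, here is a minimal sketch in Python. Everything in it is invented for illustration: the place names, the coordinates, the post titles and the helper names `load_places` and `mash` are all assumptions, not real data or anyone's real API. One data set is JSON, the other is CSV, and a few lines of code tie them together on a shared key.

```python
import csv
import io
import json

# Data set 1 (invented): blog posts tagged with a place name, as JSON text.
POSTS_JSON = """
[
  {"title": "Best coffee this week", "place": "Alphaville"},
  {"title": "Street art walk", "place": "Betatown"},
  {"title": "Mystery picnic", "place": "Gammaburg"}
]
"""

# Data set 2 (invented): coordinates for some places, as CSV text.
PLACES_CSV = """place,lat,lon
Alphaville,-33.90,151.18
Betatown,-33.88,151.19
"""


def load_places(csv_text):
    """Index the CSV rows by place name so they can be joined on."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["place"]: (float(row["lat"]), float(row["lon"])) for row in reader}


def mash(posts_json, places_csv):
    """Combine the two data sets: attach coordinates to each post."""
    places = load_places(places_csv)
    mashed = []
    for post in json.loads(posts_json):
        coords = places.get(post["place"])
        if coords is None:
            continue  # no match in the second data set, so skip this post
        mashed.append({"title": post["title"], "lat": coords[0], "lon": coords[1]})
    return mashed


if __name__ == "__main__":
    for item in mash(POSTS_JSON, PLACES_CSV):
        print(item)
```

The result is a list of posts you could drop onto a map. Note that the post about the unmatched place is silently dropped, which is exactly the kind of thing that goes wrong in real mashups when the two sources disagree about names.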
1 some data sets
So, what are some datasets you can play with?
- some online databases – maybe too many, you’ll have to delve into these guys!
- a geographic dataset search
- This blog’s feed (can you remember how to find the address from last week?)
- Hell, even the GotGastro data is online
- Anything with an RSS feed will do – how about your favourite blog?
- …or something locally relevant
- Want more? These, plus many more advanced datasets, are on ProgrammableWeb
There are many, many more, and you can easily roll your own. Even this blog is a dataset.
2 a recognised format
Which brings us to the data formats question… Which I will gloss over for now. Let’s start by saying there are ways to convert between various formats, some harder and some easier, and leave the detail until later. RSS, JSON and KML should be no problem for any of us. Later on we can handle CSV, REST, and maybe more.
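As a taste of what converting between formats looks like, here is a rough sketch that turns a tiny RSS feed into JSON using only Python's standard library. The feed text, the example.com links and the function name `rss_to_json` are made up for this example, and real feeds carry many more fields than the two picked out here.

```python
import json
import xml.etree.ElementTree as ET

# A tiny, invented RSS 2.0 feed. A real one would be fetched from a URL.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example blog</title>
    <item>
      <title>First post</title>
      <link>http://example.com/first</link>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.com/second</link>
    </item>
  </channel>
</rss>
"""


def rss_to_json(rss_text):
    """Pull the title and link out of every <item> and return JSON text."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
        })
    return json.dumps(items, indent=2)


if __name__ == "__main__":
    print(rss_to_json(SAMPLE_RSS))
```

This is one of the easier conversions, because both formats are happy to describe a list of items with named fields.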
3 mashing software
This week’s practical exercise will be built using Yahoo Pipes, and will follow on from Yahoo’s own examples page. Once we’ve had a play there, however, there is a plethora of other options that don’t even require any code – here are two excellent roundups, one from geek superstar Simon Willison and one from SitePoint. There are lots of toolkits you can use to do this: Google Spreadsheets, Google Maps, xFruits (the only one I don’t regularly use), and Dapper are among the most powerful. You might also be interested in building mashups right inside your browser with Firefox’s Ubiquity. There are also more complex tools for doing it all, like writing your own code, using maps and JavaScript, or the mega YQL datasets.
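If you are curious what "writing your own code" might look like, here is a small Python sketch of the fetch, filter and sort chain that a visual tool wires together with boxes and arrows. It is not Yahoo Pipes' API or anyone else's: the stage names (`fetch`, `keep_matching`, `sort_by`, `run_pipeline`) and the feed items are invented for illustration, and the "fetch" stage just hands back a hard-coded list rather than going to the web.

```python
# Invented feed items standing in for whatever a real feed would return.
FEED = [
    {"title": "Map of local cafes", "published": "2009-03-02"},
    {"title": "Weekly roundup", "published": "2009-03-01"},
    {"title": "A map of bike paths", "published": "2009-02-27"},
]


def fetch(items):
    """Source stage: a real pipeline would download a feed here."""
    yield from items


def keep_matching(items, keyword):
    """Filter stage: pass through only items whose title mentions the keyword."""
    for item in items:
        if keyword.lower() in item["title"].lower():
            yield item


def sort_by(items, key):
    """Sort stage: order the surviving items by one of their fields."""
    return sorted(items, key=lambda item: item[key])


def run_pipeline(feed, keyword):
    """Chain the stages together, one feeding into the next."""
    return sort_by(keep_matching(fetch(feed), keyword), "published")


if __name__ == "__main__":
    for item in run_pipeline(FEED, "map"):
        print(item["published"], item["title"])
```

Each stage only knows about the items flowing through it, which is why the no-code tools can let you rearrange them freely.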
More advanced guides can be found, again, on ProgrammableWeb, or in one of the many guidebooks. (e.g.)
See if you can knock something together in class. WARNING: things will break, go wrong and simply fail to work. Risks of the trade.
Finally, let’s be thinking about what all this enables. Let’s look at a site with (maybe) valuable data that doesn’t enable mashups – the Australian government toilet locator. Can you think how this data might be more useful in combination with other sites? Has anyone else mapped their toilet data? (yes!)
Does anyone want to try to reproduce this gentleman’s amazing local government map?
Or: consider which data should be available to mashup. This is a hot topic in Australia right now, and there’s even money in it for you.


What a shame the Australian Gov’t won’t allow mashups with the National Public Toilet Maps. I love that you can add them to a “my toilets” list…