So that’s what they sound like…

August 19, 2011

On Thursday 11 August, three accomplished musicians and one tone-deaf member of the project team congregated in the new Ensemble Room in Oxford University’s Music Faculty to record some of the pieces which are to be included in What’s the score at the Bodleian?

Our sound engineer chose to use some rather less daunting equipment than this.

The pieces recorded are all by Charles d’Albert, and were selected on the somewhat unscientific basis that the collection is being digitised alphabetically by composer, and d’Albert is by far the most prolific of those composers whose surname begins with the letter A.  That and the fact that we rather liked some of his tunes.

The Faculty provided us with microphones and some portable recording equipment (and instructed us how to use it), as well as – crucially – a piano (for which instructions weren’t required).  Twenty-seven takes and a couple of hours later, we had five piano pieces in the bag, performed with aplomb by Ben Sheen and Tim Hawken, all ready for editing and post-production:

  • Nearest and Dearest (Waltz, on airs from Audran’s comic opera, Olivette), performed by Ben Sheen
  • Trial by Jury Polka (on airs from Arthur Sullivan’s operetta), performed by Ben Sheen
  • The Rink Galop (as performed at the Royal Aquarium, Westminster), performed by Ben Sheen
  • Adelina (on Jules Cohen’s Opera Les Bluets), performed by Ben Sheen
  • The Cleopatra Galop, performed by Tim Hawken

Meanwhile, having been thoroughly entertained by some live performances of parlour songs at the project’s introductory reception in June, and with plenty of studio time left, we found it too good an opportunity to miss to also record some non-project songs which demonstrate other aspects of domestic music that were popular in the 19th century:

  • Come into the Garden, Maud, by M. W. Balfe (words by Alfred, Lord Tennyson)
  • Home! Sweet Home! by Sir Henry Bishop (words by John Howard Payne)
  • The Lost Chord, by Sir Arthur Sullivan (words by Adelaide A. Procter)

As at the June event, these were sung by Greg Skidmore, accompanied by Tim Hawken on piano.

All of these pieces have now been made available on a new Recordings page on the project’s website, and for each of the d’Albert piano pieces a PDF file of the score itself is also provided.  Once the crowd-sourcing aspect of the project is in full swing, it is hoped that any members of ‘the crowd’ with a modicum of talent on the ivories will make their own recordings of piano pieces delivered through the project and allow us to put them online.


Crowdsourcing for transcription projects

What’s the Score at the Bodleian? featured at an event this weekend. The Crowdsourcing for transcription projects workshop was held at Merton College, Oxford, and saw a group of eager participants gather to hear about and discuss crowdsourcing in the context of transcription projects.

The program for the half-day event opened with three talks:

Arfon Smith (Zooniverse) couldn’t be at the event in person but luckily a podcast was available of his very relevant talk from the Beyond Collections 2011 conference last month. The talk provided a useful introduction not only to Zooniverse but to a number of issues to be considered by anyone setting out to conduct an academic crowdsourcing project.

David Tomkins described the What’s the Score at the Bodleian? project and showed some examples of the kind of material we are working with. He also shared some screenshots from the data collection interface, which has just been made available to a small group of testers.

Giles Bergel (Bodleian Ballads) talked about the work with ballads that has been going on at the Bodleian for a number of years. He presented some of the problems with the material, such as some text being very difficult to read for both humans and machines, and outlined some ideas for future projects.

The rest of the morning was spent productively discussing thoughts on crowdsourcing in general and ideas about transcription crowdsourcing in particular. It was agreed that this was beneficial, and that we’d like to continue the discussions and exchange of ideas. A first step will be to collect the ideas and references gathered during the day and circulate these, together with suggestions for further activities for the group.

Categories: events

With a bit of music

June 17, 2011
Greg Skidmore and Tim Hawken performing parlour songs


What’s the Score at the Bodleian? was introduced to a number of specially invited guests at a small event in Oxford this week.  About 40 people gathered in the Denis Arnold Hall in the University’s Faculty of Music to hear Bodley’s Librarian, Dr Sarah Thomas, introduce the project, after which three members of project staff each gave short presentations.  Martin Holmes outlined the Bodleian’s extensive music collections and explained some of the problems currently faced in making them more accessible; Ylva Berglund-Prytz gave an overview of crowd-sourcing as a relatively quick and economic means of capturing data, citing in particular some of the initiatives undertaken by Zooniverse; and David Tomkins gave an overview of the project methodology and its potential for additional enhancements in the future.  The presentations provided a platform for much informal discussion about the project and its future directions between guests and project staff over drinks and canapes.

Before and after the presentations, Greg Skidmore and Tim Hawken stole the show with a series of outstanding performances of parlour songs and piano pieces selected from the first batch of digitised scores for the project.

Categories: events

Optical Music Recognition

Some time ago, the What’s the score at the Bodleian? project team went to see Matt McGrattan at The Bodleian Digital Library Systems and Services. We wanted to find out what it would take to be able to use our digitized scores to automatically generate a sound file to go with the sheet music, and Matt had been looking into this.

A number of programs exist that will convert images of music into a kind of notation that can be read by computers to, for example, generate a sound file or be used as input into music editing programs. Background reading on the matter had suggested that our material was unlikely to convert easily, as far too many variables were non-ideal (see some of the references in the Optical Music Recognition Bibliography). We nevertheless wanted to explore what it would take to make it worthwhile to include automatic music recognition in the project.

Screenshot of Audiveris interface (from


The program Matt used for our initial test was Audiveris. Audiveris is an open-source Optical Music Recognition (OMR) tool that can ‘interpret’ music notation and convert it to a form of data (Music XML) that can then be used as input into other programs.
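To give a rough idea of what it means for music to be ‘a form of data that can be used as input into other programs’, here is a minimal sketch in Python that reads a tiny MusicXML fragment and lists the notes it contains. The fragment is hand-written for illustration and is not output from Audiveris; the function name `list_notes` is our own invention, and the sketch ignores sharps, flats and most of what real MusicXML files contain.

```python
import xml.etree.ElementTree as ET

# A tiny hand-written MusicXML fragment (illustrative only, not Audiveris output).
SAMPLE = """<score-partwise version="3.1">
  <part-list>
    <score-part id="P1"><part-name>Piano</part-name></score-part>
  </part-list>
  <part id="P1">
    <measure number="1">
      <note><pitch><step>C</step><octave>4</octave></pitch><duration>1</duration></note>
      <note><pitch><step>E</step><octave>4</octave></pitch><duration>1</duration></note>
      <note><rest/><duration>2</duration></note>
    </measure>
  </part>
</score-partwise>"""


def list_notes(musicxml_text):
    """Return (measure number, pitch name or 'rest', duration) for each note.

    Hypothetical helper: accidentals (<alter>) are deliberately ignored.
    """
    root = ET.fromstring(musicxml_text)
    notes = []
    for measure in root.iter("measure"):
        for note in measure.iter("note"):
            pitch = note.find("pitch")
            if pitch is None:
                name = "rest"  # a note element without a pitch is a rest here
            else:
                name = pitch.findtext("step") + pitch.findtext("octave")
            notes.append((measure.get("number"), name, int(note.findtext("duration"))))
    return notes


print(list_notes(SAMPLE))
```

Once the notes are available in this kind of structured form, another program can turn them into sound or load them into a music editor, which is why the accuracy of the recognition step matters so much.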

Before we could use the program, our sample file had to be pre-processed (for example making sure it was the right format and size). The file was then loaded into Audiveris and processed as illustrated in the Quick example found on the Audiveris website.

The initial output that we got was not perfect, and what this meant became obvious when the file was used to automatically generate a sound file. Matt suggested it sounded ‘like something by Scott Joplin’. For some kinds of music that is the desired effect, but in this case it was not. It is perfectly possible to post-edit the initial output and manually correct some of the problems, but the time and effort necessary for this mean we could not fit it into the current phase of the project. It is, however, something we want to continue to look into.

This test only included one program (Audiveris) and was performed on only one of our samples. It is possible that other programs will suit our material better, or that this process will be better suited for other types of material. As we are hoping to be able to digitize and make available other kinds of scores in the future, we will continue to explore options for automatic optical music recognition. We’ll report on any further findings as and when we have some.

Sample files

Cover for Abbey House Schottische

‘fancy fonts’

As with all digitisation projects, it is important to test your technology on a small sample of material before you finalise your plans. That was, naturally, also done on the What’s the score at the Bodleian? project. Our sample consisted of a few items from the collection, both loose sheets of music (with colour covers and covers without illustration) and a bound volume. Although the currently planned part of this project will be focussing on purely instrumental piano music, we also included some pieces with lyrics in the sample for testing.

The material was scanned at different resolutions and Optical Character Recognition (OCR) was run on the files to see how any text was picked up. The result showed that the material was eminently ‘scannable’ and we received clear and good scans. Not unexpectedly, it was found that in many cases the OCR was not particularly successful when it came to identifying text in ‘fancy fonts’. As many of our covers consist of text in decorative lettering, that means we will not be able to rely on that for the description of the covers. Luckily, humans tend to be able to read this kind of text without too much effort, so it shouldn’t be difficult to decode for the people contributing to the project later.

Categories: project progress

What’s a duplicate?

March 15, 2011

As we are going through the boxes, we are identifying duplicates, the idea being that we do not need two identical copies of the same item. But what is ‘the same’? It may seem obvious at first – if it is the same piece of music it is a duplicate. But what if it is a different edition, where some changes (may) have been made to the music? Well, then it is not an identical copy and thus not a duplicate. But what if the music is the same but the cover differs?

We have taken the view here that if the cover is different, the items are not duplicates, even if the music would sound identical irrespective of what copy you play it from. The reasoning behind this is that these items are not only music scores. The actual physical copies are interesting, and variations there can very well be of interest to someone researching the genre or period.

The differences between two versions of a score can be quite obvious, like the Valentine Galop pictured here.

Different covers for Valentine Galop

The covers look different – one has an illustration while the other uses different fonts in a decorative way – which may make it less obvious that this is actually the same music. It is the same composer (although called M Relle on one cover and Moritz Relle on the other), but the title is slightly different (St Valentine’s Galop vs Valentine Galop). It is only by looking at the actual music notation that we will know if it is the same piece. In this case it is easy to justify scanning both copies, since there is so much to look at and compare for someone researching the area.

In other cases, duplicates may be less obvious. It may be that the cover looks very similar, but a closer inspection reveals small differences, for example that the advertisements on the back are different or that the list of titles in the series contains a different number of items. If these were to be considered duplicates, which one should be scanned? Who should decide that one set of adverts is more important or interesting than another? We have refrained from making that decision and are instead scanning both copies in cases like these. This will allow different kinds of research on the material. The actual number of near-duplicate scores is fairly low, so in the grand scheme of things scanning the near-duplicates is a small extra. Having them does, however, also allow a further interesting use, namely quality assurance. Having the same title described twice will allow us to make comparisons between the different descriptions and see in what way they differ (if at all). That will help us understand how much variation we should expect in the descriptions that we are getting. There are other ways this quality assurance can be performed, and we will be using various methods to get material that is truly useful for those who wish to make use of it.
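As a rough illustration of the kind of comparison this makes possible, the sketch below scores how similar two descriptions of the same item are, field by field. It is only a sketch under our own assumptions: the field names (`title`, `composer`) and the function `compare_descriptions` are hypothetical and do not reflect the project’s actual data model, and the similarity measure (Python’s standard `difflib`) is just one simple choice among many.

```python
from difflib import SequenceMatcher


def compare_descriptions(a, b):
    """Return a similarity ratio (0.0 to 1.0) for each field in two descriptions.

    Hypothetical helper: descriptions are plain dicts of field name -> text.
    A missing field is treated as an empty string.
    """
    report = {}
    for field in sorted(set(a) | set(b)):
        left = a.get(field, "").strip().lower()
        right = b.get(field, "").strip().lower()
        report[field] = round(SequenceMatcher(None, left, right).ratio(), 2)
    return report


# Example values taken from the two Valentine Galop covers discussed above.
first = {"title": "Valentine Galop", "composer": "Moritz Relle"}
second = {"title": "St Valentine's Galop", "composer": "M Relle"}

for field, score in compare_descriptions(first, second).items():
    print(field, score)
```

A score of 1.0 means the two descriptions agree exactly on that field (ignoring case and surrounding spaces); lower scores flag fields where the descriptions diverge and a human may want to take a closer look.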


Some of the boxes containing our scores


As we were working away on our boxes (I had just finished counting no 39 of 64), we heard the fire alarm. After a short while it became obvious that this was not a test or a brief error: the bell was chiming steadily and we had no option but to leave the building. I hated doing that – leaving all our boxes behind. What if it really WAS a fire? What would happen to my galops and waltzes and beautiful covers? I had to fight an urge to carry them all with me – I didn’t even take the Wedding Valse. What shall I now do if I come back and discover it is all in cinders? At least I have some photographs to remind me of what the boxes looked like…

On a more serious note, although this incident turned out not to be a real fire, it highlights how important digitisation really is. By digitising material we will be able to use it and rejoice in what it has to offer even if we cannot access the original physical copies.

Categories: project progress