Hard Graph Musings
QUESTION: Our performing arts organization implemented a ticket-scanning system at the beginning of the season. We now have a complete season of data (six concerts). Is there anything we might learn from analyzing it?
[650 words - 3 minute read]
A quick conversation with the staff reveals that they present pre-concert activities in the lobby. The concert begins at 8 pm, but the pre-concert lobby activities start at 7 pm. They also hold a pre-concert talk in the main hall at the same time. Both activities are frequently mentioned in grant applications. Could this data shed some light on how many attendees come early? If the staff alter the start time of each activity, could they increase its exposure?
The data was exported out of their database in the following format:
There were actually over 10,000 records (lines of data), but this sample shows the fields available: the kind of ticket (Ticket order Type), the entry date and time, which scanner recorded the entry (Entry Device Label and Entry Device Name), and the concert date and time (Event Instance: Date). I appended a couple of columns to isolate the entry time without the date. This enabled me to create a new linked data table (below) showing how many tickets were scanned during each minute prior to the beginning of the concert. The last three columns give the time the tickets were scanned, followed by the average and cumulative number of tickets per minute. I’ve shown only the last 15 minutes (7:45 to 8:00 pm), but the full table goes back two hours (i.e., from 6:00 pm).
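For readers who would rather script this step than build it in a spreadsheet, here is a minimal Python sketch of the same per-minute table. It is an illustration only, not the workflow actually used: the function name `per_minute_table`, the `(concert_id, "HH:MM:SS")` record layout, and the sample scans are all assumptions made up for the example.

```python
from collections import Counter
from datetime import datetime, timedelta

def per_minute_table(scans, start, end, n_concerts):
    """Return rows of (minute, average tickets per concert, cumulative average).

    scans: iterable of (concert_id, "HH:MM:SS") pairs (a hypothetical layout)
    start, end: "HH:MM" bounds of the window; end is exclusive
    """
    # Count scans per clock minute, pooling all concerts together.
    counts = Counter(entry_time[:5] for _, entry_time in scans)

    rows = []
    cumulative = 0.0
    minute = datetime.strptime(start, "%H:%M")
    stop = datetime.strptime(end, "%H:%M")
    while minute < stop:
        label = minute.strftime("%H:%M")
        average = counts.get(label, 0) / n_concerts
        cumulative += average
        rows.append((label, average, cumulative))
        minute += timedelta(minutes=1)
    return rows

# Made-up sample: two concerts, a handful of scans.
sample = [
    (1, "19:45:10"), (1, "19:45:40"), (1, "19:46:05"),
    (2, "19:45:30"), (2, "19:47:15"), (2, "19:47:50"),
]
table = per_minute_table(sample, "19:45", "19:48", n_concerts=2)
for label, average, cumulative in table:
    print(label, average, cumulative)
```

The sketch assumes zero-padded 24-hour times. The cumulative column only counts scans from the start of the window onward, so anyone scanned earlier would need to be added separately.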
Now this is starting to be more useful. But once we translate this into graph format, it becomes more revealing. The graph below is a time graph showing how many tickets were scanned during each minute leading up to the concert start time. Each of the six concerts is shown with its own line.
There are a couple of anomalies in the data: Concert 1 (blue line) had 21 people enter at 1 hour 51 minutes prior to concert start, and Concert 3 (purple line) had 44 people enter during the 58th minute before the concert started. We will leave those in for now. It is conceivable that a large group came in at once or, more likely, that a scanner malfunctioned and a stack of collected tickets was quickly scanned as soon as it came back online.
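Spikes like these can also be flagged automatically by comparing each minute with its neighbours. The sketch below is one possible approach, not what was done for this analysis: the function name `flag_spikes`, the five-minute window on each side, the factor of 5, and the minimum count of 10 are arbitrary assumptions, and apart from the 44 the counts are invented.

```python
from statistics import median

def flag_spikes(counts, window=5, factor=5, min_count=10):
    """Return indices whose count far exceeds the median of nearby minutes."""
    spikes = []
    for i, value in enumerate(counts):
        # Neighbours on both sides, excluding the minute itself.
        neighbours = counts[max(0, i - window):i] + counts[i + 1:i + 1 + window]
        if not neighbours:
            continue
        baseline = median(neighbours)
        # Treat a baseline below 1 as 1 so quiet stretches do not flag tiny counts.
        if value >= min_count and value > factor * max(baseline, 1):
            spikes.append(i)
    return spikes

# Invented per-minute counts for one concert; index 4 is an obvious spike.
counts = [1, 0, 2, 1, 44, 2, 3, 1, 2, 4]
print(flag_spikes(counts))
```

A flagged minute still needs a human decision: a real group arrival should stay in the data, while a batch of delayed scans arguably belongs to earlier minutes.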
As expected, this chart shows us the overall trend of more people coming into the hall nearer to concert start time. It is noteworthy that the peak is consistently around 10 to 15 minutes before 8 pm. But this is just a mess of lines. Once we use the average number of tickets per minute, then add the cumulative column so we know how many people are in the building at a given time, we start to get some more useful information.
Now we can see that we do get a small peak leading up to the 7 pm start of those pre-concert activities. Particularly noteworthy is the red line, the cumulative count, which shows approximately 200 people in the hall at 7 pm. Just 20 minutes later, at 7:20 pm, this has doubled to 400, and it doubles again to 800 by 7:40 pm.
The pre-concert talk is locked in at 7 pm: at this particular venue it takes place on stage, and the stage needs to be reset for the performers at 7:30, leaving a 30-minute window for the talk. The pre-concert lobby activities operate under fewer time restrictions, and their format allows patrons to “come by” during the activity. The later these can finish, the more people will experience them.
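To put rough numbers on that trade-off, we can use the three headcounts read off the cumulative line (about 200 at 7:00 pm, 400 at 7:20 pm, and 800 at 7:40 pm). The sketch below estimates how many people are in the building at a candidate finishing time. The straight-line interpolation between the three points and the function name `headcount_at` are assumptions for illustration; the real arrival curve is not linear, so treat the in-between values as ballpark figures.

```python
from bisect import bisect_right

# Approximate cumulative headcounts from the chart: minutes after 7:00 pm -> people.
KNOWN = [(0, 200), (20, 400), (40, 800)]

def headcount_at(minutes_after_7):
    """Estimate people in the building by linear interpolation (an assumption)."""
    times = [t for t, _ in KNOWN]
    if not times[0] <= minutes_after_7 <= times[-1]:
        raise ValueError("outside the range read from the chart")
    # Pick the segment whose right-hand point lies at or after the requested time.
    i = min(bisect_right(times, minutes_after_7), len(KNOWN) - 1)
    (t0, c0), (t1, c1) = KNOWN[i - 1], KNOWN[i]
    return c0 + (c1 - c0) * (minutes_after_7 - t0) / (t1 - t0)

for finish in (0, 20, 30, 40):
    print(finish, headcount_at(finish))
```

Under these assumptions, a lobby activity still running at 7:30 pm would have roughly three times the potential audience of one that ends at 7:00 pm.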