5. Photographers at the Museum

The V&A began acquiring photographs in 1852, and its collection is now one of the largest and most important in the world. Let’s take a look at which photographers are held in the (catalogued) collection.

To find the photographers in the V&A collections, we need to query the object types “Photograph” and “Photographs” (both are needed because of variations in cataloguing) and cluster the results by ‘maker’. We then show a treemap visualisation of the top 50 makers for each object type.
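As a small illustration of the two queries described above, the cluster URLs can be assembled from their parameters rather than typed out by hand. This is only a sketch: the endpoint and parameter names are the ones used in this section, while `CLUSTER_ENDPOINT` and `cluster_url` are hypothetical names introduced here.

```python
from urllib.parse import urlencode

# Endpoint as used elsewhere in this section
CLUSTER_ENDPOINT = "https://api.vam.ac.uk/v2/objects/clusters/maker/search"


def cluster_url(object_type, cluster_size=50):
    """Build a maker-cluster query URL for one object type (hypothetical helper)."""
    params = {"kw_object_type": object_type, "cluster_size": cluster_size}
    return CLUSTER_ENDPOINT + "?" + urlencode(params)


# One URL per cataloguing variant of the object type
urls = [cluster_url(t) for t in ("Photograph", "Photographs")]
for url in urls:
    print(url)
```

Building the URL from a dictionary keeps the two queries identical apart from the object type, which makes the cataloguing variation explicit.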

For a record of the results, we also generate a sample of the photographers and some of their works as a PDF.

Data Visualisation

Now we query the API for each object type and show the resulting makers as a treemap.


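import requests

# treemap() is a plotting helper that is not defined in this section;
# it is assumed to take the list of cluster records and a chart title.

# Makers of objects catalogued as "Photograph"
req = requests.get('https://api.vam.ac.uk/v2/objects/clusters/maker/search?kw_object_type=Photograph&cluster_size=50')
treemap(req.json(), "Photographers (Photograph)")


# Makers of objects catalogued as "Photographs"
req = requests.get('https://api.vam.ac.uk/v2/objects/clusters/maker/search?kw_object_type=Photographs&cluster_size=50')
treemap(req.json(), "Photographers (Photographs)")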
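
The `treemap` helper called above is not defined in this section. As a rough idea of what such a helper might look like, here is a sketch assuming Plotly Express is installed. The names `cluster_to_treemap_rows` and `treemap_sketch` are hypothetical, and the actual helper may well be implemented differently. The sample records reuse two rows from the results table in this section.

```python
def cluster_to_treemap_rows(clusters):
    """Turn cluster records into parallel label and count lists for plotting."""
    labels = [c["value"] for c in clusters]
    counts = [c["count"] for c in clusters]
    return labels, counts


def treemap_sketch(clusters, title):
    """Return a Plotly treemap figure with one tile per maker (hypothetical helper)."""
    # Imported lazily so the data preparation above works without plotly installed
    import plotly.express as px

    labels, counts = cluster_to_treemap_rows(clusters)
    # Every tile sits at the top level, so each parent is the empty string
    return px.treemap(names=labels, parents=[""] * len(labels), values=counts, title=title)


sample = [
    {"id": "A6403", "value": "Frith, Francis", "count": 4153},
    {"id": "N3687", "value": "Thompson, Charles Thurston", "count": 2244},
]
labels, counts = cluster_to_treemap_rows(sample)
print(labels, counts)
```

In a notebook, `treemap_sketch(req.json(), "Photographers").show()` would then display the chart.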

Saving as CSV

At present there is no custom CSV response for cluster endpoints; this might be added in a future version of the API. For now, the response returns the identifier (‘id’), a descriptive term (‘value’) and the count of matching object records (‘count’), which we load into a data frame.

import pandas as pd
df_photograph = pd.read_json('https://api.vam.ac.uk/v2/objects/clusters/maker/search?kw_object_type=Photograph&cluster_size=100', orient='records')
df_photograph["link"] = "https://collections.vam.ac.uk/search/?id_person=" + df_photograph['id']

   id          value                                         count  count_max_error  link
0  A1848       Unknown                                       11846  0                https://collections.vam.ac.uk/search/?id_perso...
1  A6403       Frith, Francis                                 4153  0                https://collections.vam.ac.uk/search/?id_perso...
2  AUTH334543  K.A.C. Creswell                                3335  0                https://collections.vam.ac.uk/search/?id_perso...
3  N3687       Thompson, Charles Thurston                     2244  0                https://collections.vam.ac.uk/search/?id_perso...
4  A5970       London Stereoscopic and Photographic Company   1703  0                https://collections.vam.ac.uk/search/?id_perso...
5  A4801       Stone, Benjamin Sir                            1532  0                https://collections.vam.ac.uk/search/?id_perso...
6  AUTH325233  Parker, John Henry                             1449  0                https://collections.vam.ac.uk/search/?id_perso...
7  AUTH335751  Thompson, Stephen                              1265  0                https://collections.vam.ac.uk/search/?id_perso...
8  A4798       Scamell, George Mr                             1035  0                https://collections.vam.ac.uk/search/?id_perso...
9  A5902       Beaton, Cecil (Sir)                             959  0                https://collections.vam.ac.uk/search/?id_perso...

import pandas as pd
df_photographs = pd.read_json('https://api.vam.ac.uk/v2/objects/clusters/maker/search?kw_object_type=photographs&cluster_size=100', orient='records')
df_photographs["link"] = "https://collections.vam.ac.uk/search/?id_person=" + df_photographs['id']
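
The data frames above are never actually written to disk. With pandas, `DataFrame.to_csv` would do that directly. As a dependency-free sketch of the same idea, the snippet below merges cluster records from both object types and writes them as CSV using only the standard library. The helper names `merge_clusters` and `clusters_to_csv` are hypothetical, and the count in the second sample list is made up purely for illustration.

```python
import csv
import io


def merge_clusters(*cluster_lists):
    """Sum counts for makers that appear under more than one object type."""
    totals = {}
    for clusters in cluster_lists:
        for record in clusters:
            entry = totals.setdefault(
                record["id"],
                {"id": record["id"], "value": record["value"], "count": 0},
            )
            entry["count"] += record["count"]
    # Largest makers first
    return sorted(totals.values(), key=lambda r: r["count"], reverse=True)


def clusters_to_csv(records, handle):
    """Write id, value, count and a collections search link for each maker."""
    writer = csv.DictWriter(handle, fieldnames=["id", "value", "count", "link"])
    writer.writeheader()
    for r in records:
        link = "https://collections.vam.ac.uk/search/?id_person=" + r["id"]
        writer.writerow({**r, "link": link})


photograph = [
    {"id": "A6403", "value": "Frith, Francis", "count": 4153},
    {"id": "A5902", "value": "Beaton, Cecil (Sir)", "count": 959},
]
# Illustrative only: this count is invented, not taken from the API
photographs = [{"id": "A6403", "value": "Frith, Francis", "count": 10}]

merged = merge_clusters(photograph, photographs)
buffer = io.StringIO()
clusters_to_csv(merged, buffer)
print(buffer.getvalue())
```

Passing an open file instead of the `StringIO` buffer, for example `open("photographers.csv", "w", newline="")`, would save the same rows to disk.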
# Fetch the maker clusters again as plain JSON records for the PDF sample below
import requests
req = requests.get('https://api.vam.ac.uk/v2/objects/clusters/maker/search?kw_object_type=Photograph&cluster_size=100')
top100_photographers = req.json()

Generating PDF sample

For those who would rather not read spreadsheets and data tables, another way to look at the data is to construct a PDF with a sample of objects for each photographer. This code (drawn from the reportlab documentation) generates a very simple PDF with images of 5 objects for each of the first 10 photographers.

from reportlab.pdfgen import canvas
from reportlab.platypus import SimpleDocTemplate, Paragraph, Spacer, Image, Table, TableStyle
from reportlab.lib.styles import getSampleStyleSheet
from reportlab.rl_config import defaultPageSize
from reportlab.lib.units import inch
import PIL
from io import BytesIO
import requests
import pandas as pd

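styles = getSampleStyleSheet()
top100_photographs = ""

# Page dimensions used to position the title on the first page
PAGE_WIDTH = defaultPageSize[0]
PAGE_HEIGHT = defaultPageSize[1]

Title = "V&A Photographers - Top 10"
pageinfo = "vam-api-data-exploration-5"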

def myFirstPage(canvas, doc): 
    canvas.drawCentredString(PAGE_WIDTH/2.0, PAGE_HEIGHT-108, Title) 
    canvas.drawString(inch, 0.75 * inch, "First Page / %s" % pageinfo) 

def myLaterPages(canvas, doc): 
    canvas.drawString(inch, 0.75 * inch, "Page %d %s" % (doc.page, pageinfo)) 

chart_style = TableStyle([('ALIGN', (0, 0), (-1, -1), 'CENTER')])

def build_doc():
    doc = SimpleDocTemplate("photographers-samples.pdf")
    Story = [Spacer(1, 2*inch)]
    style = styles["Normal"]
    IIIF_IMAGE_URL = "https://framemark.vam.ac.uk/collections/%s/full/!100,100/0/default.jpg"
    # Only the first 10 photographers are included in the sample
    for photographer in top100_photographers[:10]:
        photographer_name = photographer['value']
        photographer_id = photographer['id']
        Story.append(Paragraph(photographer_name, style))
        # Retrieve this maker's objects that have images, as CSV
        query_url = "https://api.vam.ac.uk/v2/objects/search?id_maker=%s&images_exist=1&response_format=csv&page_size=20" % photographer_id
        photograph_objects = pd.read_csv(query_url)
        images = []
        links = []
        # Use up to 5 objects per photographer, each with its own image and link
        for _, obj in photograph_objects.head(5).iterrows():
            # Point directly to the thumbnail derivative of the primary image
            r = requests.get(IIIF_IMAGE_URL % obj["_primaryImageId"])
            images.append(Image(BytesIO(r.content), width=inch, height=inch))
            # Link text is the title where one exists, otherwise the object type
            label = obj["_primaryTitle"] if pd.notna(obj["_primaryTitle"]) else obj["objectType"]
            links.append(Paragraph('<link href="https://collections.vam.ac.uk/item/%s">%s</link>' % (obj["systemNumber"], label), style))
        if images:
            Story.append(Table([images, links],
                         colWidths=[inch] * len(images),
                         rowHeights=[0.75*inch, 0.25*inch], style=chart_style))
    doc.build(Story, onFirstPage=myFirstPage, onLaterPages=myLaterPages)

# Generate photographers-samples.pdf from the maker clusters fetched earlier
build_doc()
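
The thumbnail URLs used in `build_doc` follow the IIIF Image API path pattern of region, size, rotation, then quality and format. As a small sketch of how that URL is put together, the helper below builds it for a given image identifier. The name `iiif_thumbnail_url` and the identifier `EXAMPLE_ID` are placeholders introduced here, not real values from the collection.

```python
# Base URL as used in build_doc
IIIF_BASE = "https://framemark.vam.ac.uk/collections"


def iiif_thumbnail_url(image_id, max_side=100):
    """Build a IIIF thumbnail URL for one image identifier (hypothetical helper)."""
    # Path segments: {region}/{size}/{rotation}/{quality}.{format}
    # "full" selects the whole image; "!w,h" scales it to fit within w x h
    # while keeping the aspect ratio; "0" means no rotation.
    return "%s/%s/full/!%d,%d/0/default.jpg" % (IIIF_BASE, image_id, max_side, max_side)


# EXAMPLE_ID is a placeholder, not a real V&A image identifier
url = iiif_thumbnail_url("EXAMPLE_ID")
print(url)
```

Raising `max_side` requests a larger derivative from the same endpoint, which may be useful if the PDF thumbnails look too coarse.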

import altair as alt
import plotly.express as px

data = px.data.iris()

