The Painter Goblin: Part 3, Data Sources
One thing that held the Painter Goblin project back was finding a data source to get images from.
There are potentially hundreds of sources out there, but! Taking the path of least resistance means that:
- Any source needs either hackable URIs** (uniform resource identifiers) or a randomizing function.
- Ideally, a data source doesn’t link to yet-another-page, e.g. portal-like websites that only point to others’ collections.
- Ideally the data source links directly to an image to download.
- Data can be easily selected by category, e.g. just paintings, or posters, not just ‘art’.
** A hackable URI is a URI pattern that can be cycled through using computational techniques, even if the underlying data isn’t entirely well-known. E.g., for lack of a more concrete example, subsequent pages might live at http://example.com/image/0001, http://example.com/image/0002, and so on.
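To make the footnote a little more tangible, here is a minimal Python sketch of cycling through a hackable URI pattern. It isn't the Painter Goblin's actual code: the function name `hackable_uris` is hypothetical, the base URL is the placeholder from the footnote, and the four-digit zero-padding is just an assumption read off that example.

```python
from itertools import islice
from typing import Iterator


def hackable_uris(base: str, start: int = 1, width: int = 4) -> Iterator[str]:
    """Yield URIs by counting up a zero-padded numeric identifier.

    `base`, `start` and `width` are illustrative assumptions; a real
    source will have its own pattern that needs working out by hand.
    """
    n = start
    while True:
        # e.g. base="http://example.com/image", n=1 -> ".../image/0001"
        yield f"{base}/{n:0{width}d}"
        n += 1


# Take the first three candidate URIs from the (endless) generator.
first_three = list(islice(hackable_uris("http://example.com/image"), 3))
for uri in first_three:
    print(uri)
```

The generator only builds candidate addresses; whether anything actually lives at each one still has to be checked when the image is requested.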
I wanted to explore heritage sources such as Europeana, TROVE, and DPLA. I struggled to search these effectively though, and struggled to see how I might automate using them. I recognise they have APIs, and I’ll revisit them in the future as I look to expand the Painter Goblin’s corpus.
Enter Wikidata.








