I’m using an endpoint that returns 20 of an artist’s related artists. Once I get that data, I process it further and then build a graph. Spotify doesn’t provide more than 20 artists per call IIRC, and there’s no way to specify how many levels deep to go.
So, for example, if my API gets the search term “Lorde”, it first checks whether Lorde has been processed before. If not, it calls Spotify’s API to get Lorde’s Spotify ID (and saves it to the DB), then fetches her related artists (and their related artists, and their related artists… up to n levels deep) and builds the graph. So what I end up with in the database is similar to what you described. This way, later on, if I want to listen to random songs based on Lorde’s related artists, my app can generate a list of songs from these connections. This is pretty much what I’m doing manually right now, so I want to automate the process and make it less tedious.
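A rough sketch of that crawl as a breadth-first traversal. The names `build_graph` and `fetch_related` are made up for illustration; `fetch_related` stands in for whatever wraps the actual Spotify call and is assumed to return a list of related artist IDs.

```python
from collections import deque
from typing import Callable, Dict, List

# Hypothetical callable: takes an artist ID, returns related artist IDs.
FetchRelated = Callable[[str], List[str]]


def build_graph(root_id: str, max_depth: int,
                fetch_related: FetchRelated) -> Dict[str, List[str]]:
    """Breadth-first crawl from root_id; each artist is fetched at most once."""
    graph: Dict[str, List[str]] = {}   # artist ID -> related artist IDs
    depth = {root_id: 0}               # artist ID -> distance from the root
    queue = deque([root_id])
    while queue:
        artist = queue.popleft()
        if depth[artist] >= max_depth:
            continue  # boundary artist: known, but not expanded
        related = fetch_related(artist)
        graph[artist] = related
        for other in related:
            if other not in depth:     # skip artists already seen
                depth[other] = depth[artist] + 1
                queue.append(other)
    return graph
```

The `depth` check on already-seen artists matters because related-artist links loop back on themselves a lot, so without it the same artist would be fetched over and over.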
I’m thinking that things could slow down very fast if I have to go down many levels every time, since each extra level can multiply the number of API calls by up to 20. That’s why I want to store the graph. If I only crawled Lorde up to 2 levels the first time, for example, and then someone else wanted to go up to 5, all I’d have to do is load Lorde’s level 2 artists and continue from there instead of starting all the way at the top with her again.
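A minimal sketch of resuming a crawl from the stored frontier. It assumes the DB keeps, per root artist, a `graph` mapping (artists already expanded) and a `depth` mapping (every artist seen so far and its level); both names and the `fetch_related` callable are hypothetical.

```python
from collections import deque
from typing import Callable, Dict, List


def extend_graph(graph: Dict[str, List[str]], depth: Dict[str, int],
                 target_depth: int,
                 fetch_related: Callable[[str], List[str]]) -> Dict[str, List[str]]:
    """Deepen a stored crawl in place, fetching only unexpanded artists."""
    # Frontier = artists we have seen but never expanded, shallowest first.
    frontier = sorted((a for a in depth if a not in graph), key=depth.get)
    queue = deque(frontier)
    while queue:
        artist = queue.popleft()
        if depth[artist] >= target_depth:
            continue  # stays on the frontier for a later, deeper request
        related = fetch_related(artist)
        graph[artist] = related
        for other in related:
            if other not in depth:
                depth[other] = depth[artist] + 1
                queue.append(other)
    return graph
```

Starting with an empty `graph` and `depth = {"lorde": 0}` does the initial crawl; calling it again later with a larger `target_depth` only hits the API for the artists at the old boundary and below.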
To keep the data in my DB relatively up to date, I was planning on having a background job that pings the API once every x days to check and update my data if needed. But maybe this is a poor solution. Would you recommend simply storing the Spotify ID and name of each resource (artists, albums, songs) and fetching everything else on a request-by-request basis?
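For the background-job option, a small sketch of the staleness check, assuming each stored artist has a timezone-aware `last_fetched` timestamp (a hypothetical column) and the job only re-fetches rows older than x days:

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional


def stale_artists(last_fetched: Dict[str, datetime], max_age_days: int,
                  now: Optional[datetime] = None) -> List[str]:
    """Return IDs whose data is older than max_age_days, oldest first."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    stale = [artist for artist, ts in last_fetched.items() if ts < cutoff]
    return sorted(stale, key=last_fetched.get)
```

Sorting oldest first means that if the job runs into a rate limit partway through, the most outdated rows have already been refreshed.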