Spotify Database Schema

I created this db schema for Spotify as an exercise.
My goals were to maximize flexibility, keep search time complexity low, and
favor readability and simplicity of design, leaning on join tables for the many-to-many relationships.

- Each User has many events, playlists, followers and users they are following
- Each Artist has many songs, events and keywords
- Each Song has many artists, many genres, many albums, many playlists and many keywords
- Each Album has many songs and many keywords
- Each Genre has many songs, many keywords and many events
- Each Playlist belongs to many users and has many keywords

With this design it would be easy to provide fast keyword searching across
all of the categories represented in Spotify's data set.
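
To make these relationships concrete, here is a minimal sketch of one slice of the schema (songs, artists and keywords) using SQLAlchemy. The ORM, table names and column names are my own illustrative assumptions, not something the diagram dictates; the other entities follow the same join-table pattern.

```python
# A hedged sketch of the song/artist/keyword slice of the schema.
from sqlalchemy import Column, ForeignKey, Integer, String, Table
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()

# Join table for the many-to-many link between songs and artists.
song_artists = Table(
    "song_artists", Base.metadata,
    Column("song_id", ForeignKey("songs.id"), primary_key=True),
    Column("artist_id", ForeignKey("artists.id"), primary_key=True),
)

# Join table linking songs to searchable keywords.
song_keywords = Table(
    "song_keywords", Base.metadata,
    Column("song_id", ForeignKey("songs.id"), primary_key=True),
    Column("keyword_id", ForeignKey("keywords.id"), primary_key=True),
)

class Song(Base):
    __tablename__ = "songs"
    id = Column(Integer, primary_key=True)
    title = Column(String, nullable=False)
    artists = relationship("Artist", secondary=song_artists, back_populates="songs")
    keywords = relationship("Keyword", secondary=song_keywords, back_populates="songs")

class Artist(Base):
    __tablename__ = "artists"
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)
    songs = relationship("Song", secondary=song_artists, back_populates="artists")

class Keyword(Base):
    __tablename__ = "keywords"
    id = Column(Integer, primary_key=True)
    text = Column(String, unique=True, nullable=False)
    songs = relationship("Song", secondary=song_keywords, back_populates="keywords")
```

A keyword search then becomes a single join through a keyword join table, which is why each entity in the diagram has many keywords.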

Data Flow in We Go Too

I designed We Go Too as a data-driven app. Users create our data by recording
or plotting a route on our React Native map. Routes are searchable by keyword,
and group events can be created to share a route in real time with friends.
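
As a rough illustration of that flow, here is a tiny sketch of the core records (route, keywords, group event) in Python. The field names are my own assumptions for illustration, not the app's actual schema.

```python
# Hypothetical shapes for We Go Too's core data, for illustration only.
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class Route:
    owner_id: int
    points: List[Tuple[float, float]]                  # (latitude, longitude) pairs recorded or plotted on the map
    keywords: List[str] = field(default_factory=list)  # tags used for keyword search

@dataclass
class GroupEvent:
    route: Route
    member_ids: List[int]                              # friends following the route in real time

def search_routes(routes: List[Route], keyword: str) -> List[Route]:
    """Naive in-memory keyword search over routes."""
    keyword = keyword.lower()
    return [r for r in routes if keyword in (k.lower() for k in r.keywords)]
```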

We have lots of plans for this app to grow and scale into niche markets, from
school field trips and massive games of urban tag to tour group management and
treasure hunts. Our flexible structure and forward-thinking data design will
allow us to meet the needs of a growing client base without difficulty!

What is audio fingerprinting?

Here’s how the fingerprinting works:

When you’re humming a song to someone, you’re creating a fingerprint because you’re extracting from the music what you think is essential (and if you’re a good singer, the person will recognize the song).

You can think of any piece of music as a time-frequency graph called a spectrogram. On one axis is time, on another is frequency, and on the 3rd is intensity. Each point on the graph represents the intensity of a given frequency at a specific point in time. Assuming time is on the x-axis and frequency is on the y-axis, a horizontal line would represent a continuous pure tone and a vertical line would represent an instantaneous burst of white noise.
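
To make the spectrogram idea concrete, here is a small sketch using NumPy and SciPy (my choice of tools; the post does not prescribe a library). A pure 440Hz tone shows up as a single horizontal line of high intensity.

```python
# A spectrogram is a grid of intensities indexed by (frequency, time).
import numpy as np
from scipy.signal import spectrogram

sample_rate = 44100                        # samples per second
t = np.arange(0, 2.0, 1.0 / sample_rate)   # two seconds of audio
signal = np.sin(2 * np.pi * 440 * t)       # a continuous pure tone at 440 Hz

# frequencies: the y-axis, times: the x-axis, intensity: power at each point
frequencies, times, intensity = spectrogram(signal, fs=sample_rate, nperseg=1024)
print(frequencies.shape, times.shape, intensity.shape)
```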

Human ears have more difficulty hearing a low sound (<500Hz) than a mid sound (500Hz-2000Hz) or a high sound (>2000Hz). As a result, the low frequencies of many “raw” songs are artificially boosted before release. If you keep only the most powerful frequencies, you’ll end up with only the low ones, and if 2 songs have the same drum part they might have very similar filtered spectrograms even though there are flutes in the first song and guitars in the second.

Here is a simple way to keep only strong frequencies while reducing the previous problems:

For each FFT result, you put the 512 bins inside 6 logarithmic bands. For each band you keep the strongest bin. You then compute the average value of these 6 strongest bins, and keep only the bins (of the 6) that are above this mean multiplied by a coefficient. The last step is very important: an a cappella track with soprano singers may have only mid or mid-high frequencies, while a jazz or rap track may have only low and low-mid frequencies, so a single absolute cutoff would not work for both.
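
Here is a hedged sketch of that filtering step in Python, assuming each FFT frame yields 512 magnitude bins. The exact logarithmic band edges and the coefficient are my own illustrative choices.

```python
import numpy as np

# Six roughly logarithmic bands over 512 FFT bins (the edges are an assumption).
BAND_EDGES = [0, 10, 20, 40, 80, 160, 512]

def strong_bins(fft_magnitudes, coefficient=1.0):
    """Return the (bin_index, magnitude) pairs kept for one FFT frame."""
    # 1. Keep the strongest bin inside each of the 6 bands.
    strongest = []
    for lo, hi in zip(BAND_EDGES[:-1], BAND_EDGES[1:]):
        idx = lo + int(np.argmax(fft_magnitudes[lo:hi]))
        strongest.append((idx, fft_magnitudes[idx]))

    # 2. Average the magnitudes of these 6 band maxima.
    mean = np.mean([m for _, m in strongest])

    # 3. Keep only the band maxima above the scaled mean, so a band with no
    #    real energy cannot contribute a "strong" frequency.
    return [(idx, m) for idx, m in strongest if m > mean * coefficient]

# Example on one frame of random magnitudes:
frame = np.abs(np.random.randn(512))
print(strong_bins(frame))
```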

Deploy first.. code later


Let’s face it… if your project doesn’t work on anyone else’s machine, you have much bigger problems than a buggy button.

It’s really important to start a project by deploying first. If you think you’ll have time at the end, you won’t. It’s inevitable that a project will push you to the limits of your time constraints, and when your boss wants to see what you’ve accomplished, it’s so much better to send a deployed version missing a few things than something that has yet to work anywhere but on localhost. If you wait until the last day or the last hour to deploy, you will be totally devastated when you run into any issues.

And on a very positive note… if all goes well, you’ll have time to add that super cool animation or extra feature!

Key issues that can plague deployment late in the game are:
* Dependencies
* API callback addresses
* Environment variables
* Real world testing environment
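
As one small example of heading these off early, configuration such as callback addresses can be read from environment variables from day one rather than hard-coded. The variable names below are my own illustrative assumptions.

```python
import os

# Fall back to local defaults so this sketch runs anywhere; production sets the real values.
DATABASE_URL = os.environ.get("DATABASE_URL", "sqlite:///dev.db")

# The API/OAuth callback address differs between localhost and the deployed host,
# so it should come from the environment rather than being hard-coded.
API_CALLBACK_URL = os.environ.get("API_CALLBACK_URL", "http://localhost:3000/callback")

DEBUG = os.environ.get("DEBUG", "false").lower() == "true"

if __name__ == "__main__":
    print(f"db: {DATABASE_URL}, callback: {API_CALLBACK_URL}, debug: {DEBUG}")
```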