diff --git a/spotify_similarity_search/Spotify Similarity Search - Part 1 - Distance Based Search.ipynb b/spotify_similarity_search/Spotify Similarity Search - Part 1 - Distance Based Search.ipynb
index 0347cec..4fff031 100644
--- a/spotify_similarity_search/Spotify Similarity Search - Part 1 - Distance Based Search.ipynb
+++ b/spotify_similarity_search/Spotify Similarity Search - Part 1 - Distance Based Search.ipynb
@@ -6,7 +6,7 @@
"source": [
"# Finding Similar Songs on Spotify - Part 1: Distance Based Search\n",
"\n",
- "The first part of this tutorial series demonstrates the traditional way of extracting features from the audio content, training a classifier and predicting results. Because we do not have access to the raw audio content, we cannot extract features ourselves. Fortunately, Spotify is so generious to provide extracted features via their API. Those are just low-level audio features, but they are more than any other streaming music service provide - so Kudos to Spotify for this API! To download the features from the Spotify API you need to apply for a valid client ID. Please follow the steps on the Github page to apply for such an ID.\n",
+ "The first part of this tutorial series demonstrates the traditional way of extracting features from the audio content, training a classifier and predicting results. Because we do not have access to the raw audio content, we cannot extract features ourselves. Fortunately, Spotify is generous enough to provide extracted features via their API. Those are just low-level audio features, but they are more than any other streaming music service provides, so kudos to Spotify for this API! To download the features from the Spotify API you need to apply for a valid client ID. Please follow the steps on the GitHub page to apply for such an ID.\n",
"\n",
"\n",
"## Part 1 - Overview\n",
@@ -117,7 +117,7 @@
" User authentication requires interaction with your\n",
" web browser. Once you enter your credentials and\n",
" give authorization, you will be redirected to\n",
- " a url. Paste that url you were directed to to\n",
+ " a url. Paste that url you were directed to to\n",
" complete the authorization.\n",
"\n",
" Opened https://accounts.spotify.com/authorize?scope=playlist-modify-public&redirect_uri=ht...\n",
@@ -191,9 +191,9 @@
"source": [
"### Get Playlist meta-data\n",
"\n",
- "Insted of writing one big loop to download the data, I decided to split it into separate more comprehensible steps.\n",
+ "Instead of writing one big loop to download the data, I decided to split it into separate, more comprehensible steps.\n",
"\n",
- "The Spotify API does not return infinite elements, but requires batch processing. The largest batch size is 100 items such as tracks, artists or albums. As a first step we get relevant meta-data for the supplied playlists. Especially the *num_track* property is conveniant for the further processing."
+ "The Spotify API does not return an unlimited number of elements per request, but requires batch processing. The largest batch size is 100 items such as tracks, artists or albums. As a first step we get relevant meta-data for the supplied playlists. The *num_track* property is especially convenient for further processing."
]
},
{
@@ -375,7 +375,7 @@
"\n",
"We will use caching to locally store retrieved data. This is on the one hand a requirement of the API and on the other it speeds up processing when we reload the notebook. *joblib* is a convenient library which simplifies caching.\n",
"\n",
- "*Update the cachdir to an appropriate path in the following cell*"
+ "*Update the cachedir to an appropriate path in the following cell*"
]
},
{
@@ -399,7 +399,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "The following method retrieves meta-data, sequential features such as *MFCCs* and *Chroma*, and track-level features such as *Dancability*. The *@memory.cache* annotation tells *joblib* to persist all return values for the supplied parameters."
+ "The following method retrieves meta-data, sequential features such as *MFCCs* and *Chroma*, and track-level features such as *Danceability*. The *@memory.cache* decorator tells *joblib* to persist all return values for the supplied parameters."
]
},
{
@@ -749,7 +749,7 @@
"\n",
"### Single Vector Representation\n",
"\n",
- "The simlarity retrieval approach presented in this tutorial is based on a vector-space model where each track is represented of a single fixed-length feature vector. The segment-based features provided by the Spotify API are lists of feature vectors of varying lengths. Thus, these features need to be aggregated into a single feature vector. The following function describes a simple approach to do so:"
+ "The similarity retrieval approach presented in this tutorial is based on a vector-space model where each track is represented by a single fixed-length feature vector. The segment-based features provided by the Spotify API are lists of feature vectors of varying lengths. Thus, these features need to be aggregated into a single feature vector. The following function describes a simple approach to do so:"
]
},
{
@@ -829,7 +829,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "Afgregate all features of the downloaded data"
+ "Aggregate all features of the downloaded data"
]
},
{
@@ -889,7 +889,7 @@
"source": [
"### Normalize feature data\n",
"\n",
- "The feature vectors are composed of differnt feature-sets. All of them with different value ranges. While features such as Acousticness and Danceability are scaled between 0 and 1, the BPM values of the tempo feature ranges around 120 or higher. We apply Standard Score or Zero Mean and Unit Variance normalization to uniformly scale the value ranges of the features.\n",
+ "The feature vectors are composed of different feature-sets, all of them with different value ranges. While features such as Acousticness and Danceability are scaled between 0 and 1, the BPM values of the tempo feature range around 120 or higher. We apply Standard Score (Zero Mean and Unit Variance) normalization to uniformly scale the value ranges of the features.\n",
"\n",
"$$\n",
"z = {x- \\mu \\over \\sigma}\n",
@@ -922,7 +922,7 @@
" ID Mean Standard Deviation\n",
" 0 1517.5993814237531 291.1855836731788\n",
"\n",
- "In this example the center frequency is 1518 Hz and it deviates by 291 Hz. These numbers already describe the audio content and can be used to find similar tracks. The common approach to calcualte music similarity from audio content is based on vector difference. The assumption is, that similar audio feature-values correspond with similar audio content. Thus, feature vectors with smaller vector differences correspond to more similar tracks. The following data represents the extracted Spectral Centroids of our 10-tracks collection:\n",
+ "In this example the center frequency is 1518 Hz and it deviates by 291 Hz. These numbers already describe the audio content and can be used to find similar tracks. The common approach to calculating music similarity from audio content is based on vector difference. The assumption is that similar audio feature-values correspond to similar audio content. Thus, feature vectors with smaller vector differences correspond to more similar tracks. The following data represents the extracted Spectral Centroids of our 10-track collection:\n",
"\n",
"\n",
" ID Mean Standard Deviation\n",
@@ -994,7 +994,7 @@
"source": [
"### Euclidean Distance\n",
"\n",
- "In the final part of this tutorial we wil use the Euclidean Distance to calculate similarities between tracks. As mentioned above, the Euclidean Distance is a metric to calculate the distance between two vectors and thus is a function of dissimilarity. This means, vectors with smaller distance values are more similar than those with higher distances.\n",
+ "In the final part of this tutorial we will use the Euclidean Distance to calculate similarities between tracks. As mentioned above, the Euclidean Distance is a metric to calculate the distance between two vectors and thus is a function of dissimilarity. This means that vectors with smaller distance values are more similar than those with higher distances.\n",
"\n",
"$$\n",
"d(p,q) = \\sqrt{\\sum_{i=1}^n (q_i-p_i)^2}\n",
@@ -1009,7 +1009,7 @@
},
"outputs": [],
"source": [
- "def eucledian_distance(feature_space, query_vector):\n",
+ "def euclidean_distance(feature_space, query_vector):\n",
" \n",
" return np.sqrt(np.sum((feature_space - query_vector)**2, axis=1))"
]
@@ -1108,7 +1108,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "The following lines of code implement the approach described above. First, the distances between the query vector and all other vectors of the collection are calculated. Then the distances are sorted ascnedingly to get the simlar tracks. Because the metric distance of identical vectors is 0, the top-most entry of the sorted list is always the query track."
+ "The following lines of code implement the approach described above. First, the distances between the query vector and all other vectors of the collection are calculated. Then the distances are sorted ascendingly to get the similar tracks. Because the metric distance of identical vectors is 0, the top-most entry of the sorted list is always the query track."
]
},
{
@@ -1272,7 +1272,7 @@
],
"source": [
"# calculate the distance between the query-vector and all others\n",
- "dist = eucledian_distance(feature_data, feature_data[query_track_idx])\n",
+ "dist = euclidean_distance(feature_data, feature_data[query_track_idx])\n",
"\n",
"# sort the distances ascendingly - use sorted index\n",
"sorted_idx = np.argsort(dist)\n",
@@ -1287,11 +1287,11 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "### Scaled Eucledian Distance\n",
+ "### Scaled Euclidean Distance\n",
"\n",
- "The approach taken to combine the different feature-sets is refered to as early fusion. The problem with the approach described in the previous step is, that larger feature-sets dominate the calculated distance values. The aggregated MFCC and Chroma features have 24 dimensions each. Together they have more dimensions as the remaining features which are mostly single dimensional features. Thus, the distances are unequally dominated by the two feature sets.\n",
+ "The approach taken to combine the different feature-sets is referred to as early fusion. The problem with the approach described in the previous step is that larger feature-sets dominate the calculated distance values. The aggregated MFCC and Chroma features have 24 dimensions each. Together they have more dimensions than the remaining features, which are mostly single-dimensional. Thus, the distances are disproportionately dominated by these two feature sets.\n",
"\n",
- "To avoid such a bias, we scale the feature-space such that feature-sets and single-value features have euqal the same weights and thus euqal influence on the resulting distance."
+ "To avoid such a bias, we scale the feature-space such that feature-sets and single-value features have the same weights and thus equal influence on the resulting distance."
]
},
{
@@ -1331,7 +1331,7 @@
},
"outputs": [],
"source": [
- "def scaled_eucledian_distance(feature_space, query_vector):\n",
+ "def scaled_euclidean_distance(feature_space, query_vector):\n",
" \n",
" distances = (feature_space - query_vector)**2\n",
" \n",
@@ -1516,7 +1516,7 @@
}
],
"source": [
- "dist = scaled_eucledian_distance(feature_data, feature_data[query_track_idx])\n",
+ "dist = scaled_euclidean_distance(feature_data, feature_data[query_track_idx])\n",
"\n",
"metadata.loc[np.argsort(dist)[:11], display_cols]"
]
@@ -1527,7 +1527,7 @@
"source": [
"### Feature Weighting\n",
"\n",
- "As explained above, the vanilla Eucliden Distance in an early fusion approach is dominated by large feature-sets. Through scaling the feature-space we achieved equal influence for all feature-sets and features. Now, equal influence is not always the best choice fo music similarity. For example, the year and popularity feature we included into our feature vector are not an intrinsic music property. We just added them to cluster recordings of the same epoch together. Currently this feature has the same impact on the estimated similarity as timbre, rhythm and harmonics. When using many features it is commonly a good choice to apply different weights to them. Estimating these weights is generally achieved empirically."
+ "As explained above, the vanilla Euclidean Distance in an early fusion approach is dominated by large feature-sets. By scaling the feature-space we achieved equal influence for all feature-sets and features. However, equal influence is not always the best choice for music similarity. For example, the year and popularity features we included in our feature vector are not intrinsic musical properties. We just added them to cluster recordings of the same epoch together. Currently these features have the same impact on the estimated similarity as timbre, rhythm and harmonics. When using many features it is commonly a good choice to apply different weights to them. These weights are generally estimated empirically."
]
},
{
@@ -1567,7 +1567,7 @@
},
"outputs": [],
"source": [
- "def weighted_eucledian_distance(feature_space, query_vector, featureset_weights):\n",
+ "def weighted_euclidean_distance(feature_space, query_vector, featureset_weights):\n",
" \n",
" distances = (feature_space - query_vector)**2\n",
" \n",
@@ -1753,7 +1753,7 @@
}
],
"source": [
- "dist = weighted_eucledian_distance(feature_data, feature_data[query_track_idx], featureset_weights)\n",
+ "dist = weighted_euclidean_distance(feature_data, feature_data[query_track_idx], featureset_weights)\n",
"\n",
"metadata.loc[np.argsort(dist)[:11], display_cols]"
]
@@ -1801,7 +1801,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "Run the evauation for all three introduced algorithms:"
+ "Run the evaluation for all three introduced algorithms:"
]
},
{
@@ -1836,17 +1836,17 @@
"  </thead>\n",
"  <tbody>\n",
"    <tr>\n",
- "      <th>Weighted Eucledian Distance</th>\n",
+ "      <th>Weighted Euclidean Distance</th>\n",
"      <td>0.583351</td>\n",
"      <td>0.34</td>\n",
"    </tr>\n",
"    <tr>\n",
- "      <th>Scaled Eucledian Distance</th>\n",
+ "      <th>Scaled Euclidean Distance</th>\n",
"      <td>0.501596</td>\n",
"      <td>0.40</td>\n",
"    </tr>\n",
"    <tr>\n",
- "      <th>Eucledian Distance</th>\n",
+ "      <th>Euclidean Distance</th>\n",
"      <td>0.438723</td>\n",
"      <td>0.10</td>\n",
"    </tr>\n",
@@ -1856,9 +1856,9 @@
],
"text/plain": [
" precision recall\n",
- "Weighted Eucledian Distance 0.583351 0.34\n",
- "Scaled Eucledian Distance 0.501596 0.40\n",
- "Eucledian Distance 0.438723 0.10"
+ "Weighted Euclidean Distance 0.583351 0.34\n",
+ "Scaled Euclidean Distance 0.501596 0.40\n",
+ "Euclidean Distance 0.438723 0.10"
]
},
"execution_count": 74,
@@ -1873,14 +1873,14 @@
"\n",
"# run evaluation\n",
"\n",
- "evaluation_results[\"Eucledian Distance\"] = \\\n",
- " evaluate(lambda x,y: eucledian_distance(x,y), cut_off)\n",
+ "evaluation_results[\"Euclidean Distance\"] = \\\n",
+ " evaluate(lambda x,y: euclidean_distance(x,y), cut_off)\n",
" \n",
- "evaluation_results[\"Scaled Eucledian Distance\"] = \\\n",
- " evaluate(lambda x,y: scaled_eucledian_distance(x,y), cut_off)\n",
+ "evaluation_results[\"Scaled Euclidean Distance\"] = \\\n",
+ " evaluate(lambda x,y: scaled_euclidean_distance(x,y), cut_off)\n",
"\n",
- "evaluation_results[\"Weighted Eucledian Distance\"] = \\\n",
- " evaluate(lambda x,y: weighted_eucledian_distance(x,y, featureset_weights), cut_off)\n",
+ "evaluation_results[\"Weighted Euclidean Distance\"] = \\\n",
+ " evaluate(lambda x,y: weighted_euclidean_distance(x,y, featureset_weights), cut_off)\n",
"\n",
"# aggregate results\n",
"evaluation_results = pd.DataFrame(data = evaluation_results.values(), \n",
@@ -1895,7 +1895,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "These results must be interpreted in relation to the analyzed data-set and the method how the metrics are measured. We measure how many tracks in the resulting list of similar songs belong to the same playlist of the query song. We have chosen genre-related playlists such as *Metal* and *Hip-Hop*. But there are also overalpping playlists such as *Classic Metal* and *Rock Hymns* which both contain Rock and Metal tracks. This should be considered in the interpretation of the evaluation results. To get more reliable results, more efforts need to be put into creating better non-overlapping playlists. But, since music similarity is subject to subjective interpretation, this is a challinging task.\n",
+ "These results must be interpreted in relation to the analyzed data-set and the way the metrics are measured. We measure how many tracks in the resulting list of similar songs belong to the same playlist as the query song. We have chosen genre-related playlists such as *Metal* and *Hip-Hop*. But there are also overlapping playlists such as *Classic Metal* and *Rock Hymns* which both contain Rock and Metal tracks. This should be considered in the interpretation of the evaluation results. To get more reliable results, more effort needs to be put into creating better non-overlapping playlists. But, since music similarity is subject to subjective interpretation, this is a challenging task.\n",
"\n",
"Although we have a small bias from the overlapping playlists, we see that it makes sense to tune the weights of the features to regulate their impact on the final results. "
]
@@ -1912,9 +1912,9 @@
"\n",
"* **Feature aggregation:** taking only mean and standard deviation is not the most efficient way to aggregate the sequential features provided by the Spotify API.\n",
"* **Distance Measure:** other distance measures could yield better results. This often depends on the underlying dataset.\n",
- "* **Better Machine Learning Methods:** the presented nearest neighobr based approach is a linear model and is not able to model non-linearities of music similarities.\n",
+ "* **Better Machine Learning Methods:** the presented nearest-neighbour-based approach is a linear model and is not able to model non-linearities of music similarities.\n",
"\n",
- "In the next part of this tutorial series I will introduce Siamese Netowkrs. These Deep Neural Networks are able to learn high-level features from the low-level features as well as to learn the non-linear distance function to estimate the similarity between two tracks."
+ "In the next part of this tutorial series I will introduce Siamese Networks. These Deep Neural Networks are able to learn high-level features from the low-level features as well as to learn the non-linear distance function to estimate the similarity between two tracks."
]
}
],
diff --git a/spotify_similarity_search/Spotify Similarity Search - Part 2 - Siamese Networks.ipynb b/spotify_similarity_search/Spotify Similarity Search - Part 2 - Siamese Networks.ipynb
index 533c2ee..7f85996 100644
--- a/spotify_similarity_search/Spotify Similarity Search - Part 2 - Siamese Networks.ipynb
+++ b/spotify_similarity_search/Spotify Similarity Search - Part 2 - Siamese Networks.ipynb
@@ -332,9 +332,9 @@
"source": [
"# Siamese Networks\n",
"\n",
- "A Siamese neural network is a neural network architecture where two inputs are fed into the same stack of network layers. This is where the name comes from. The shared layers are \"similar\" to Siamese Twins. By feeding two inputs to the shared layers, two representations are generated which can be used for comparison. To train the network according a certain task, it requires labelled data. To learn a simlarity function, these labels should indicate if the two input are similar or dissimilar.\n",
+ "A Siamese neural network is a neural network architecture where two inputs are fed into the same stack of network layers. This is where the name comes from. The shared layers are \"similar\" to Siamese Twins. By feeding two inputs to the shared layers, two representations are generated which can be used for comparison. To train the network for a certain task, it requires labelled data. To learn a similarity function, these labels should indicate if the two inputs are similar or dissimilar.\n",
"\n",
- "This is exactly the approach initially described by Hadsell-et-al.'06 (http://yann.lecun.com/exdb/publis/pdf/hadsell-chopra-lecun-06.pdf). The authors create pairs of simlar and dissimilar images. These are fed into a Siamese NEtwork stack. Finally, the model calculates the eucledian distance between the two generated representations. A contrastive loss is used, to optimize the learned simlarity.\n",
+ "This is exactly the approach initially described by Hadsell et al. (2006) (http://yann.lecun.com/exdb/publis/pdf/hadsell-chopra-lecun-06.pdf). The authors create pairs of similar and dissimilar images. These are fed into a Siamese Network stack. Finally, the model calculates the Euclidean distance between the two generated representations. A contrastive loss is used to optimize the learned similarity.\n",
"\n",
"To calculate the similarity between a seed image and the rest of the collection, the model is applied to predict the distance between this seed image and every other. The result is a list of distances which has to be sorted descendingly.\n",
"\n",
@@ -349,7 +349,7 @@
"\n",
"We use the high-level deep learning API Keras. [TODO: link]\n",
"\n",
- "[TODO: describe - auf Tom's Tutorial verweisen für instructoins]"
+ "[TODO: describe - refer to Tom's tutorial for instructions]"
]
},
{
@@ -372,7 +372,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "First we define a distance measure to compare the two representations. We will be using the well known Eucledian distance:"
+ "First we define a distance measure to compare the two representations. We will be using the well-known Euclidean distance:"
]
},
{
@@ -394,7 +394,7 @@
"source": [
"### The Siamese Network Architecture\n",
"\n",
- "Now we define the Siamese Network Architecture. It consists of two fully connected layers. These layers are shared among the \"Siamese twins\". The network takes two inputs. One goes to the left twin, the other to the right one. The Eucledian distance of the output of each twin is calculated which is the final output of the model."
+ "Now we define the Siamese Network Architecture. It consists of two fully connected layers. These layers are shared among the \"Siamese twins\". The network takes two inputs. One goes to the left twin, the other to the right one. The Euclidean distance between the outputs of the two twins is calculated, and this distance is the final output of the model."
]
},
{
@@ -589,7 +589,7 @@
"source": [
"# Evaluate\n",
"\n",
- "Now that we have a trained model, we want to evaluate its performance. We will first play around with some examples, listen to the results and judge by our subjective interpretation before we persue a general evaluation."
+ "Now that we have a trained model, we want to evaluate its performance. We will first play around with some examples, listen to the results and judge by our subjective interpretation before we pursue a general evaluation."
]
},
{
@@ -600,7 +600,7 @@
"source": [
"### Evaluate by Example\n",
"\n",
- "The following function calculated the distances between a given query track and all other tracks of the collection. The result is a list of distances where the smallest distance coresponds with the most similar track. The list is sorted descendingly and the top-ten similar tracks are presented below the information of the query track. The Spotify playlist we created at the beginning will also be updated with the query results. Thus, you can listen to it in your Spotify client."
+ "The following function calculates the distances between a given query track and all other tracks of the collection. The result is a list of distances where the smallest distance corresponds to the most similar track. The list is sorted descendingly and the top-ten similar tracks are presented below the information of the query track. The Spotify playlist we created at the beginning will also be updated with the query results. Thus, you can listen to it in your Spotify client."
]
},
{
@@ -885,7 +885,7 @@
"\n",
"More than 90% precision is quite an exciting number, but several flaws of the experimental design have to be considered. The model is evaluated according its ability rank tracks higher which belong to the same playlist of the query track. This does not imply music similarity in general. The set of playlists used in this tutorial contains some very broad lists which span over several music genres. And even songs of the same genre may sound completely different.\n",
"\n",
- "The single exploratory examples show, that, althoug from the same playlist, many ranked results do not really fit."
+ "The individual exploratory examples show that many ranked results, although from the same playlist, do not really fit."
]
},
{
@@ -1271,7 +1271,7 @@
"source": [
"### Train network with prior knowledge\n",
"\n",
- "With this lookup-table we can create more accurate input pairs. Insted of similar/dissimilar we can now apply the supplied similarites:"
+ "With this lookup-table we can create more accurate input pairs. Instead of similar/dissimilar we can now apply the supplied similarities:"
]
},
{
@@ -1629,7 +1629,7 @@
"source": [
"# Improve Performance through Identity\n",
"\n",
- "So far we have taught the network what is similar and what not, but we have not shown it, what is identical. All input pairs created so far missed to pass identical data. In the following step, we will include identical pairs into the training instances. To emphasis the identity, only identical pairs will be assigned a label of 1. All other similarity values of the lookup-table will be decreased by 0.1. Thus, tracks of the same playlist will have a similarity value 0f 0.9."
+ "So far we have taught the network what is similar and what is not, but we have not shown it what is identical. None of the input pairs created so far contained identical data. In the following step, we will include identical pairs in the training instances. To emphasize the identity, only identical pairs will be assigned a label of 1. All other similarity values of the lookup-table will be decreased by 0.1. Thus, tracks of the same playlist will have a similarity value of 0.9."
]
},
{
@@ -1974,7 +1974,7 @@
"\n",
"### Preparing input-data for the LSTM\n",
"\n",
- "Now it gets a little more complicated. LSTMs do not take a single input vector as input, but sequences of vectors, also referred to as timesteps. In this tutorial a timestep corresponds with a segment provided by the Spotify API. Such a segment contains the same features used before, but on a smaller temporal scale. We will again caoncatenate those features into a single vector and use several consecutive segments as input for the LSTM.\n",
+ "Now it gets a little more complicated. LSTMs do not take a single input vector as input, but sequences of vectors, also referred to as timesteps. In this tutorial a timestep corresponds with a segment provided by the Spotify API. Such a segment contains the same features used before, but on a smaller temporal scale. We will again concatenate those features into a single vector and use several consecutive segments as input for the LSTM.\n",
"\n",
"Unfortunately, du to the onset detection used by Spotify to segment the audio data, these segments do not have the same length. The following chart shows the distribution of the segment length's:"
]
@@ -2264,7 +2264,7 @@
"source": [
"### The Siamese Network Architecture using LSTMs\n",
"\n",
- "The network architecture consists again of shared \"twin\" layers. But, now two layers are shared. First the fully connected layers which train on the track-based features. Second, a bi-directional LSTM which trains on the re-scaled sequential data provided by the Spotify API. The network thus now takes four inputs. The outputs of the fully connected layer and the LSTM are finally joined and the Eucledian distance of each twin is calculated which is the final output of the model."
+ "The network architecture again consists of shared \"twin\" layers, but now two components are shared: first, the fully connected layers which train on the track-based features, and second, a bi-directional LSTM which trains on the re-scaled sequential data provided by the Spotify API. The network thus now takes four inputs. The outputs of the fully connected layer and the LSTM are finally joined, and the Euclidean distance between the joined outputs of the two twins is calculated, which is the final output of the model."
]
},
{
diff --git a/spotify_similarity_search/Spotify Similarity Search - Part 3 - Siamese Networks with Tag Similarity.ipynb b/spotify_similarity_search/Spotify Similarity Search - Part 3 - Siamese Networks with Tag Similarity.ipynb
index 9701943..38aab06 100644
--- a/spotify_similarity_search/Spotify Similarity Search - Part 3 - Siamese Networks with Tag Similarity.ipynb
+++ b/spotify_similarity_search/Spotify Similarity Search - Part 3 - Siamese Networks with Tag Similarity.ipynb
@@ -464,7 +464,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "As we can see, the \"rock\" genre is the most dominating one. Thus, providing this label to a track does not add more relevant information to it to distinguish it from other tracks. The term \"romantic\" on the other hand is less frequent which makes it more discrimitative.\n",
+ "As we can see, the \"rock\" genre is the most dominant one. Thus, assigning this label to a track adds little information that would distinguish it from other tracks. The term \"romantic\", on the other hand, is less frequent, which makes it more discriminative.\n",
"\n",
"[TODO: explain TF/IDF]"
]
@@ -488,7 +488,7 @@
"collapsed": true
},
"source": [
- "Based on this rescaled label weights, we can calculate the tag-based similarties between all pairs of tracks. We will use the Dice metric\n",
+ "Based on these rescaled label weights, we can calculate the tag-based similarities between all pairs of tracks. We will use the Dice metric.\n",
"\n",
"[TODO: explain DICE]"
]
@@ -933,7 +933,7 @@
"source": [
"### Evaluate by Example\n",
"\n",
- "The following function calculated the distances between a given query track and all other tracks of the collection. The result is a list of distances where the smallest distance coresponds with the most similar track. The list is sorted descendingly and the top-ten similar tracks are presented below the information of the query track. The Spotify playlist we created at the beginning will also be updated with the query results. Thus, you can listen to it in your Spotify client."
+ "The following function calculates the distances between a given query track and all other tracks of the collection. The result is a list of distances where the smallest distance corresponds to the most similar track. The list is sorted descendingly and the top-ten similar tracks are presented below the information of the query track. The Spotify playlist we created at the beginning will also be updated with the query results. Thus, you can listen to it in your Spotify client."
]
},
{
@@ -1394,7 +1394,7 @@
"source": [
"### The Siamese Network Architecture using LSTMs\n",
"\n",
- "The network architecture consists again of shared \"twin\" layers. But, now two layers are shared. First the fully connected layers which train on the track-based features. Second, a bi-directional LSTM which trains on the re-scaled sequential data provided by the Spotify API. The network thus now takes four inputs. The outputs of the fully connected layer and the LSTM are finally joined and the Eucledian distance of each twin is calculated which is the final output of the model."
+ "The network architecture again consists of shared \"twin\" layers, but now two components are shared: first, the fully connected layers which train on the track-based features, and second, a bi-directional LSTM which trains on the re-scaled sequential data provided by the Spotify API. The network thus now takes four inputs. The outputs of the fully connected layer and the LSTM are finally joined, and the Euclidean distance between the joined outputs of the two twins is calculated, which is the final output of the model."
]
},
{
diff --git a/spotify_similarity_search/Spotify Similarity Search - original.ipynb b/spotify_similarity_search/Spotify Similarity Search - original.ipynb
index cf0b719..7721e99 100644
--- a/spotify_similarity_search/Spotify Similarity Search - original.ipynb
+++ b/spotify_similarity_search/Spotify Similarity Search - original.ipynb
@@ -184,9 +184,9 @@
"source": [
"### Get Playlist meta-data\n",
"\n",
- "Insted of writing one big loop to download the data, I decided to split it into separate more comprehensible steps.\n",
+ "Instead of writing one big loop to download the data, I decided to split it into separate, more comprehensible steps.\n",
"\n",
- "The Spotify API does not return infinite elements, but requires batch processing. The largest batch size is 100 items such as tracks, artists or albums. As a first step we get relevant meta-data for the supplied playlists. Especially the *num_track* property is conveniant for the further processing."
+ "The Spotify API does not return an unlimited number of elements per request, so results must be fetched in batches. The largest batch size is 100 items, such as tracks, artists or albums. As a first step, we get the relevant meta-data for the supplied playlists. The *num_tracks* property is especially convenient for further processing."
]
},
{
@@ -297,7 +297,7 @@
"\n",
"We will use caching to locally store retrieved data. This is on the one hand a requirement of the API and on the other it speeds up processing when we reload the notebook. *joblib* is a convenient library which simplifies caching.\n",
"\n",
- "*Update the cachdir to an appropriate path in the following cell*"
+ "*Update the cachedir to an appropriate path in the following cell*"
]
},
{
@@ -321,7 +321,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "The following method retrieves meta-data, sequential features such as *MFCCs* and *Chroma*, and track-level features such as *Dancability*. The *@memory.cache* annotation tells *joblib* to persist all return values for the supplied parameters."
+ "The following function retrieves meta-data, sequential features such as *MFCCs* and *Chroma*, and track-level features such as *Danceability*. The *@memory.cache* decorator tells *joblib* to persist the return values for the supplied parameters."
]
},
{
diff --git a/spotify_similarity_search/tutorial_functions.py b/spotify_similarity_search/tutorial_functions.py
index f989f97..fbea4fc 100644
--- a/spotify_similarity_search/tutorial_functions.py
+++ b/spotify_similarity_search/tutorial_functions.py
@@ -39,7 +39,7 @@ def get_playlist_metadata(spotify_client, playlists):
# initialize fields for further processing
playlist["track_ids"] = []
-
+
return playlists
@@ -57,13 +57,13 @@ def get_track_ids(sp, playlists):
limit = np.min([batch_size, playlist["num_tracks"] - offset])
playlist_entries = sp.user_playlist_tracks(user = playlist["user"],
- playlist_id = playlist["playlist_id"],
- limit = limit,
+ playlist_id = playlist["playlist_id"],
+ limit = limit,
offset = offset,
fields = ["items"])
playlist["track_ids"].extend([entry["track"]["id"] for entry in playlist_entries["items"]])
-
+
return playlists
@@ -83,62 +83,62 @@ def aggregate_metadata(raw_track_data):
# assamble metadata
metadata.append([track_metadata["id"],
- artist_metadata["name"],
- track_metadata["name"],
+ artist_metadata["name"],
+ track_metadata["name"],
album_metadata["name"],
album_metadata["label"],
track_metadata["duration_ms"],
track_metadata["popularity"],
release_date,
- artist_metadata["genres"],
+ artist_metadata["genres"],
playlist_name])
- metadata = pd.DataFrame(metadata, columns=["track_id", "artist_name", "title", "album_name", "label",
+ metadata = pd.DataFrame(metadata, columns=["track_id", "artist_name", "title", "album_name", "label",
"duration", "popularity", "year", "genres", "playlist"])
-
+
return metadata
def aggregate_features(seq_data, track_data, metadata, with_year=False, with_popularity=False):
calc_statistical_moments = lambda x: np.concatenate([x.mean(axis=0), x.std(axis=0)])
-
+
# sequential data
segments = seq_data["segments"]
sl = len(segments)
-
+
# MFCCs - 24 dimensions
mfcc = np.array([s["timbre"] for s in segments])
mfcc = calc_statistical_moments(mfcc)
-
+
# Chroma / pitch classes - 24 dimensions
chroma = np.array([s["pitches"] for s in segments])
chroma = calc_statistical_moments(chroma)
-
+
# maximum loudness values per segment - 2 dimensions
loudness_max = np.array([s["loudness_max"] for s in segments]).reshape((sl,1))
loudness_max = calc_statistical_moments(loudness_max)
-
+
# offset of max loudness value within segment - 2 dimensions
loudness_start = np.array([s["loudness_start"] for s in segments]).reshape((sl,1))
loudness_start = calc_statistical_moments(loudness_start)
-
+
# length of max loudness values within segment - 2 dimensions
loudness_max_time = np.array([s["loudness_max_time"] for s in segments]).reshape((sl,1))
loudness_max_time = calc_statistical_moments(loudness_max_time)
-
+
# length of segment - 2 dimensions
duration = np.array([s["duration"] for s in segments]).reshape((sl,1))
duration = calc_statistical_moments(duration)
-
+
# confidence of segment boundary detection - 2 dimensions
confidence = np.array([s["confidence"] for s in segments]).reshape((sl,1))
confidence = calc_statistical_moments(confidence)
-
+
# concatenate sequential features
- sequential_features = np.concatenate([mfcc, chroma, loudness_max, loudness_start,
+ sequential_features = np.concatenate([mfcc, chroma, loudness_max, loudness_start,
loudness_max_time, duration, confidence], axis=0)
-
+
# track-based data
track_features = [track_data[0]["acousticness"], # acoustic or not?
track_data[0]["danceability"], # danceable?
@@ -149,14 +149,14 @@ def aggregate_features(seq_data, track_data, metadata, with_year=False, with_pop
track_data[0]["tempo"], # slow or fast?
track_data[0]["time_signature"], # 3/4, 4/4, 6/8, etc.
track_data[0]["valence"]] # happy or sad?
-
+
if with_year:
track_features.append(int(metadata["year"]))
-
+
if with_popularity:
track_features.append(int(metadata["popularity"]))
-
-
+
+
return np.concatenate([sequential_features, track_features], axis=0)
@@ -168,19 +168,19 @@ def aggregate_featuredata(raw_track_data, metadata):
_, _, _, f_sequential, f_trackbased = spotify_data
- feature_vec = aggregate_features(f_sequential,
- f_trackbased,
- metadata.iloc[i],
- with_year = True,
- with_popularity = True)
+ feature_vec = aggregate_features(f_sequential,
+ f_trackbased,
+ metadata.iloc[i],
+ with_year = True,
+ with_popularity = True)
feature_data.append(feature_vec)
feature_data = np.asarray(feature_data)
-
+
return feature_data
-
-
+
+
# updatable plot
# a minimal example (sort of)
@@ -190,31 +190,31 @@ def on_train_begin(self, logs={}):
self.x = []
self.losses = []
self.val_losses = []
-
+
self.fig = plt.figure()
-
+
self.logs = []
def on_epoch_end(self, epoch, logs={}):
-
+
self.logs.append(logs)
self.x.append(self.i)
self.losses.append(logs.get('loss'))
self.val_losses.append(logs.get('val_loss'))
self.i += 1
-
+
clear_output(wait=True)
plt.plot(self.x, self.losses, label="loss")
plt.plot(self.x, self.val_losses, label="val_loss")
plt.legend()
plt.show();
-
+
def contrastive_loss(y_true, y_pred):
margin = 1
return K.mean(y_true * K.square(y_pred) + (1 - y_true) * K.square(K.maximum(margin - y_pred, 0)))
-
+
def euclidean_distance(vects):
x, y = vects
return K.sqrt(K.maximum(K.sum(K.square(x - y), axis=1, keepdims=True), K.epsilon()))
-
+