Data visualisation with 1 billion Shazam music recognitions

While at university, I joined Shazam as a part-time web developer and stayed for 5 enjoyable years. This post is about one hack day project I worked on: plotting one billion Shazam recognitions onto a blank canvas and observing the result.

This post also touches upon the process I used to create the visuals.

What is a ‘Shazam recognition’?

Think of a Shazam recognition like this: you open Shazam, the mobile app, and have it ‘listen’ to a piece of music that’s playing in the background. A recognition is the successful identification of that song.

Location data

A user may opt in to sharing their location data with Shazam. Shazam then makes some of the anonymised location data (latitude and longitude) available to employees, depending on their use case.

Having anonymised location data to visualise was a cool experience. It taught me a lot about processing large datasets, about visualisations which tell a story, and about visualisations which look pretty but don’t do anything else.

The visualisation

One thing you need to know: all the visualisations follow the same idea. One dot represents one successful recognition, and dots are plotted onto a geographical coordinate system. This is not the same as taking a Google Map and then plotting location markers over it.
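
To make the "one dot per recognition" idea concrete, here is a minimal sketch of turning a latitude/longitude pair into a pixel position on a blank square canvas. It assumes a Web Mercator projection and a canvas that covers the whole world; the post does not say which projection was actually used, and the `project` function and its `size` parameter are made up for illustration.

```javascript
// Project a latitude/longitude pair (in degrees) onto a square canvas.
// Assumptions: Web Mercator projection, canvas is `size` pixels wide and
// tall, and the canvas spans the whole world.
function project(lat, lon, size) {
  // Longitude maps linearly onto the horizontal axis.
  const x = ((lon + 180) / 360) * size;

  // Latitude is stretched by the Mercator formula before being scaled.
  const latRad = (lat * Math.PI) / 180;
  const mercatorY = Math.log(Math.tan(Math.PI / 4 + latRad / 2));
  const y = ((1 - mercatorY / Math.PI) / 2) * size;

  return { x, y };
}

// Each recognition becomes one dot at the projected position.
const dot = project(51.2168, -0.8045, 1024);
console.log(dot);
```

Drawing a single pixel at each projected position, and nothing else, is what makes dense areas such as cities and roads emerge on their own.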

I have used colour to differentiate between Android and iOS. Can you guess which is which? Hint: look at the major cities. Which device type do you think is prevalent there?

  • Android: Red

  • iOS: Blue

If you look closely at the dot maps, you can see the roads clearly defined. This can be explained by passengers Shazam’ing music playing from car speakers.

I also made maps with alternative colour schemes.

Interactive Maps

I thought it would be fun to visualise the map interactively. In the same way you would drag/zoom on a Google Map, what if you could also drag/zoom a Shazam map? This element of interactivity is what encourages people to use, explore and learn from the maps, rather than treating them as something static that you never revisit.

To do this, I needed to generate millions of map ‘tiles’. For example, here are some tiles of London, taken from Google Maps.

Each tile is a separate image. Take note of the different zoom levels. As you might guess, when you drag and zoom on a Google Map, it presents many different images to you. These images are referred to as map tiles.
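
As a rough illustration of how a location relates to a tile, the sketch below uses the widely used "slippy map" tile scheme, where zoom level z splits the world into 2^z by 2^z tiles. This is an assumption for illustration: the post does not describe the tile addressing it used, and `latLonToTile` is a hypothetical helper.

```javascript
// Find the tile that contains a given latitude/longitude at a zoom level.
// Assumption: the common "slippy map" scheme (Web Mercator, tile 0/0 in the
// top-left corner, 2^zoom tiles along each axis).
function latLonToTile(lat, lon, zoom) {
  const tilesPerAxis = 2 ** zoom;
  const latRad = (lat * Math.PI) / 180;

  const x = Math.floor(((lon + 180) / 360) * tilesPerAxis);
  const y = Math.floor(
    ((1 - Math.log(Math.tan(latRad) + 1 / Math.cos(latRad)) / Math.PI) / 2) *
      tilesPerAxis
  );

  return { zoom, x, y };
}

// The same point in central London lands on a different tile at each zoom level.
for (const zoom of [0, 5, 10]) {
  console.log(latLonToTile(51.5074, -0.1278, zoom));
}
```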

Here are the tiles of the Shazam Map.

In total, I created over 40GB of tile data. This is because of the maximum zoom level I had specified: a higher maximum zoom level lets those viewing the map zoom in further, but every extra level needs many more tiles.
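
To see why the zoom level drives the size of the tile set, here is a small sketch that counts tiles. It assumes the usual scheme in which each zoom level has four times as many tiles as the one before, and it counts tiles for the whole world, so it is an upper bound rather than the number of tiles actually generated for the Shazam map. Both function names are made up for this example.

```javascript
// Number of tiles covering the whole world at a single zoom level,
// assuming 2^zoom tiles along each axis.
function tilesAtZoom(zoom) {
  return 4 ** zoom;
}

// Total number of tiles needed to serve every level from 0 up to maxZoom.
function tilesUpToZoom(maxZoom) {
  let total = 0;
  for (let zoom = 0; zoom <= maxZoom; zoom++) {
    total += tilesAtZoom(zoom);
  }
  return total;
}

// Print how quickly the total grows as the maximum zoom level increases.
for (const maxZoom of [2, 8, 14]) {
  console.log(maxZoom, tilesUpToZoom(maxZoom));
}
```

In practice only tiles that contain data need to be stored, which is why the real figure depends on where the dots are.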

Upon reviewing the visualisations with colleagues, we kept wondering what ‘place’ sat at the location of each large cluster. For example, was it a music venue where people would frequently be using Shazam?

To help answer this question, I had an idea: what if I used a location service to determine which places are present there? To do this, I used the Google Maps Places API. Every time you scroll to a new location, I query the Google Maps API to ask the question: what places are within this location?
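
As a hedged sketch of what such a query could look like, the function below builds a request URL for the Places API Nearby Search web service. The post does not say which Places endpoint or client library was used, so the endpoint choice, the `buildNearbySearchUrl` name, the 200 metre radius and the placeholder API key are all assumptions for illustration.

```javascript
// Build a Nearby Search request URL for the centre of the visible map.
// Assumptions: the Places API Nearby Search web service, a search radius in
// metres, and an API key supplied by the caller.
function buildNearbySearchUrl(lat, lon, radiusMetres, apiKey) {
  const params = new URLSearchParams({
    location: `${lat},${lon}`,
    radius: String(radiusMetres),
    key: apiKey,
  });
  return `https://maps.googleapis.com/maps/api/place/nearbysearch/json?${params}`;
}

// Called with the new map centre each time the user finishes scrolling.
console.log(buildNearbySearchUrl(51.5074, -0.1278, 200, 'YOUR_API_KEY'));
```

Because the query runs on every scroll, it is worth waiting until the map has stopped moving before sending it, to avoid a burst of requests.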

When using this feature, we began to realise that clusters of dots were typically the result of cafes, night clubs, shopping centres, convenience stores and similar places.

I also synced a Mapbox map (similar to Google Maps) so that as you drag and zoom the Shazam map, the other ‘regular’ map moves with it. This allows you to quickly identify what geographic location you are currently looking at.

The code

As with everything I do, I’m only benefiting from hard work done by others in our community. All credit goes to Eric Fischer for his excellent work on datamaps. If you follow the instructions in that GitHub repository, you’ll be able to make your own visualisations. You’ll need a dataset consisting of longitude and latitude points. You might find something on GitHub, for example in awesome-public-datasets.

If you end up trying it out, here are a few notes I made for myself which you might find useful.

First, you need a long list of latitudes and longitudes. However, to even get hold of that data, you might have to do extra work. In my case, I got it from an internal Shazam API. I used a Node module called fast-csv to parse the data. Using streams in this fashion makes parsing large data (gigabytes’ worth) simple to do.

csv.fromStream(stream, {headers: true}).on('data', handleRecord);

The handleRecord function does this:

function handleRecord(record) {
  // Emit one "lat,lon" line per parsed record.
  const location = record.location.latitude + ',' + record.location.longitude;
  console.log(location);
}

The output looks something like:

lat,lon
-22.1028,166.1833
29.8075,-95.4113
51.2168,-0.8045
27.3007,-82.5221
20.5743,-100.3793
-36.0451,146.9267
26.7554,-81.4237
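
With a dataset this large, a few malformed records are likely, so it can help to validate each one before writing it out. The following is a minimal sketch of that step as a generator, so records are still processed one at a time. It assumes each record has the `location.latitude` and `location.longitude` shape used in handleRecord, with numeric values; the `toCsvLines` name and the validation rules are illustrative assumptions, not something described in the post.

```javascript
// Lazily turn parsed records into "lat,lon" lines, skipping unusable ones.
// Assumption: records look like { location: { latitude, longitude } } with
// numeric coordinates.
function* toCsvLines(records) {
  yield 'lat,lon'; // header row, as in the sample output

  for (const record of records) {
    const { latitude, longitude } = record.location ?? {};

    // Skip records with missing or out-of-range coordinates.
    const valid =
      Number.isFinite(latitude) &&
      Number.isFinite(longitude) &&
      Math.abs(latitude) <= 90 &&
      Math.abs(longitude) <= 180;
    if (!valid) continue;

    yield `${latitude},${longitude}`;
  }
}

const sample = [
  { location: { latitude: 51.2168, longitude: -0.8045 } },
  { location: {} }, // no coordinates: skipped
];
for (const line of toCsvLines(sample)) {
  console.log(line);
}
```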

At this point, you can begin to plug the data into datamaps (there are detailed instructions within the project documentation).

After working through the documentation, I reached the point where I could create the final image. To create a datamap of London, specify the bounding box of the area you wish to capture as coordinates:

./render -A -- output 14 51.641353 -0.447693 51.333508 0.260925 > london.png

Because I created the same static maps so often (when experimenting with colour, for example), I decided to script the whole process. Being a web developer, I did this all in Node.js, although a simple Bash script would have been fine. First, I made an object containing all the maps I wanted to render.

Then it was a case of constructing the command you saw earlier, once for each location entry in that object.
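
A minimal sketch of that scripting step might look like the following. The object shape, the `buildRenderCommand` helper and the `output` tile directory default are assumptions for illustration; only the London zoom level and bounding box come from the command shown earlier.

```javascript
// Maps to render, keyed by output name. Only London is taken from the post;
// add further entries in the same shape.
const maps = {
  london: { zoom: 14, bounds: [51.641353, -0.447693, 51.333508, 0.260925] },
};

// Build the datamaps render command for one entry.
// `tileDir` is the directory holding the encoded data (assumed to be "output").
function buildRenderCommand(name, { zoom, bounds }, tileDir = 'output') {
  return `./render -A -- ${tileDir} ${zoom} ${bounds.join(' ')} > ${name}.png`;
}

// One command per map entry.
const commands = Object.entries(maps).map(([name, config]) =>
  buildRenderCommand(name, config)
);

for (const command of commands) {
  console.log(command);
}
```

Each command string could then be passed to something like Node's `child_process.execSync` to produce the images in one go.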

Presenting

At Shazam, there were multiple hack days, followed a few months later by a demo day where you presented your hack day work. This particular project was well received when I showed it.

To those developers creating command-line applications or going on code refactoring adventures during hack days: consider that a demo day audience may prefer visual demos to technical ones (this has been my experience). One way around this is to blog about what you’ve done and share the resources afterwards, skipping a live demo entirely. Or even better, figure out how to distil technical concepts for a non-technical audience, introduce more visuals, and continue to give your demo to a live audience. It’s harder, but more rewarding.

High resolution images of the data maps

Note: You can zoom into these images with the Google Photos interface.

  • World: notice which countries/cities have high iOS usage

  • United Kingdom: notice the cities

  • Toronto

  • San Francisco

  • Paris

Conclusion

I’m grateful to Shazam for encouraging us to learn new skills and technologies. Thanks also to Eric Fischer for developing the datamaps project in the first place! If you have access to location data, consider the many interesting ways of visualising it. You could also try using Tweets from the Twitter API. Just make sure they have location data attached to them.

Want to see more like this?

Follow me on Twitter: @umaar and let me know! I try to tweet out lots of web development resources.

Please like and share if you enjoyed reading my article, and leave a comment with your experiences in data visualisation.

Translated from: https://www.freecodecamp.org/news/data-visualisation-with-1-billion-shazam-music-recognitions-90728df3a8c9/

Published by

风君子
