A small command-line tool to find similar audio files

 Installation
 ============

-First, install the Chromaprint_ fingerprinting library by Lukáš Lalinský. (The library
-itself depends on an FFT library, but it's smart enough to use an algorithm from
-software you probably already have installed; see the Chromaprint page for details)
+First, install the Chromaprint_ fingerprinting library by Lukáš Lalinský. (The library itself depends on an FFT library, but it's smart enough to use an algorithm from software you probably already have installed; see the Chromaprint page for details.)

 Then you can install this library:

 .. code-block:: bash

     pip install audiomatch

-To do things fast *audiomatch* requires C compiler and Python headers to be installed.
-You can skip compilation by setting ``AUDIOMATCH_NO_EXTENSIONS`` environment variable:
+To perform tasks quickly, *audiomatch* requires a C compiler and Python headers to be installed. You can skip the compilation by setting the ``AUDIOMATCH_NO_EXTENSIONS`` environment variable:

 .. code-block:: bash

     AUDIOMATCH_NO_EXTENSIONS=1 pip install audiomatch

-You can avoid installing all this libraries on your computer and run everything in
-docker:
+You can avoid installing all these libraries on your computer and run everything in Docker:

 .. code-block:: bash

@@ -51,7 +48,7 @@ docker:
 Quickstart
 ==========

-Suppose, we have a directory with Nirvana songs:
+Suppose we have a directory with Nirvana songs:

 .. code-block:: bash

@@ -82,62 +79,47 @@ Let's find out which files sound similar:
     ./demo/Pennyroyal Tea (Solo Acoustic).mp3
     ./demo/Pennyroyal Tea (Unplugged in NYC).m4a

-*Note #1: input audio files should be at least 10 seconds long*
+*Note #1: Input audio files should be at least 10 seconds long.*

-*Note #2: in some rare cases false positives are possible*
+*Note #2: In some rare cases, false positives are possible.*

-What's happening here is that *audiomatch* takes all audio files from the directory and
-compares them with each other.
+What's happening here is that *audiomatch* takes all audio files from the directory and compares them with each other.

-You can also compare file with another file, file and directory, or directory to
-directory. If you need to, you can provide glob-style patterns, but don't forget to
-quote it, because otherwise shell expanded it for you. For example, let's compare all
-``.mp3`` files with ``.m4a`` files:
+You can also compare a file with another file, a file and a directory, or a directory with another directory. If you need to, you can provide glob-style patterns, but don't forget to quote them, because otherwise the shell will expand them for you. For example, let's compare all ``.mp3`` files with ``.m4a`` files:

 .. code-block:: bash

-    $ audiomatch "./demo/*.mp3" "./demo/*.m4a"
+    $ audiomatch "./demo/*.mp3" "./demo/*.m4a"
     These files sound similar:

     ../demo/Pennyroyal Tea (Solo Acoustic).mp3
     ../demo/Pennyroyal Tea (Unplugged in NYC).m4a

-This time, *audiomatch* took all files with ``.mp3`` extension and compare them with
-all files with ``.m4a`` extension.
+This time, *audiomatch* took all files with the ``.mp3`` extension and compared them with all files with the ``.m4a`` extension.

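The pattern-versus-pattern comparison described above can be pictured with a small sketch. This is not *audiomatch*'s actual code: the helper names ``expand`` and ``cross_pairs`` are hypothetical, and the sketch only assumes that a quoted pattern reaches the program unexpanded and is then expanded with Python's standard ``glob`` module, after which every file from the first set is paired with every file from the second.

```python
import glob
import itertools


def expand(pattern):
    """Expand a glob-style pattern into a sorted list of matching paths."""
    return sorted(glob.glob(pattern))


def cross_pairs(left_pattern, right_pattern):
    """Pair every file matching the first pattern with every file matching the second.

    Hypothetical helper: it only illustrates which comparisons would be made.
    """
    left = expand(left_pattern)
    right = expand(right_pattern)
    return list(itertools.product(left, right))


if __name__ == "__main__":
    # The patterns are passed as plain strings, just as a quoted shell
    # argument would arrive in the program's argument list.
    for a, b in cross_pairs("./demo/*.mp3", "./demo/*.m4a"):
        print(a, "<->", b)
```

If the patterns were left unquoted, the shell would expand both of them into one flat list of file names before the program started, so the program could no longer tell where the first set ends and the second begins.
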
-Note, how there is no In Utero version in the output. The reason it is present in the
-previous output, because it actually similar with Unplugged version and then transitive
-law applies: if ``a = b`` and ``b = c``, then ``a = c``.
+Note how there is no In Utero version in the output. The reason it is present in the previous output is because it is actually similar to the Unplugged version, and then the transitive law applies: if ``a = b`` and ``b = c``, then ``a = c``.

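One common way to turn pairwise matches into groups that respect this transitive rule is a union-find (disjoint-set) structure. The sketch below is only an illustration under that assumption; the README does not say how *audiomatch* groups files internally, and the function name ``group_similar`` and the short labels in the usage example are made up.

```python
def group_similar(pairs):
    """Merge pairwise matches into groups with a small union-find structure."""
    parent = {}

    def find(x):
        # Register unseen items as their own root, then walk up to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for a, b in pairs:
        union(a, b)

    # Collect every item under the root of its group.
    groups = {}
    for item in parent:
        groups.setdefault(find(item), set()).add(item)
    return list(groups.values())


if __name__ == "__main__":
    # Illustrative labels only: two pairwise matches that share one member.
    matches = [("In Utero", "Unplugged"), ("Solo Acoustic", "Unplugged")]
    print(group_similar(matches))
```

Because both pairs share the Unplugged recording, all three versions end up in one group, even though the In Utero and Solo Acoustic versions were never matched directly.
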
 --length
 --------

-The ``--length`` specifies how many seconds to take for analysis from the song. Default
-value is 120 and it is good enough to find exactly the same song, but maybe in different
-quality. However, for a more complicated cases like same song played in different tempo
-the more input we have the more accurate results are.
+The ``--length`` option specifies how many seconds to take for analysis from the song. The default value is 120, and it is good enough to find exactly the same song, but maybe in different quality. However, for more complicated cases like the same song played in a different tempo, the more input we have, the more accurate results are.

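To make the idea of "seconds taken for analysis" concrete, here is a hedged sketch that assumes fingerprints are produced by Chromaprint's ``fpcalc`` command-line tool, which has a ``-length`` option limiting how much audio is processed. Whether *audiomatch* calls ``fpcalc`` this way is an assumption, and the helper names ``fpcalc_command`` and ``fingerprint`` are hypothetical.

```python
import shutil
import subprocess


def fpcalc_command(path, length=120):
    """Build an fpcalc invocation that analyses only the first `length` seconds."""
    # -raw asks fpcalc for the uncompressed fingerprint; -length caps the
    # duration of the processed audio, in seconds.
    return ["fpcalc", "-raw", "-length", str(length), path]


def fingerprint(path, length=120):
    """Run fpcalc if it is installed and return its text output, else None."""
    if shutil.which("fpcalc") is None:
        return None
    result = subprocess.run(
        fpcalc_command(path, length), capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    # A longer window gives the matcher more material to work with.
    print(fpcalc_command("./demo/song.mp3", length=300))
```
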
 --extension
 -----------

-By default, ``audiomatch`` looks for files with ``.m4a``, ``mp3``, ``.caf`` extensions.
-In theory, audio formats supported by ffmpeg_ also supported by *audiomatch*. You can
-tell to *audiomatch* to look for a specific format by using ``--extension`` flag:
+By default, ``audiomatch`` looks for files with ``.m4a``, ``.mp3``, ``.caf`` extensions. In theory, audio formats supported by ffmpeg_ are also supported by *audiomatch*. You can tell *audiomatch* to look for a specific format by using the ``--extension`` flag:

 .. code-block:: bash

     $ audiomatch -e .ogg -e .wav ./demo
     Not enough input files.

-Indeed, we tried to compare files with ``.ogg`` and ``.wav`` extension, but there are
-no such files in the demo directory.
+Indeed, we tried to compare files with ``.ogg`` and ``.wav`` extensions, but there are no such files in the demo directory.

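The behaviour above can be approximated with a short sketch of extension filtering. It is an illustration rather than *audiomatch*'s implementation: ``collect_files`` and ``check_input`` are hypothetical names, the default extensions are taken from the text, and the rule that fewer than two files triggers the message is an assumption.

```python
from pathlib import Path

# Defaults as listed in the README text.
DEFAULT_EXTENSIONS = (".m4a", ".mp3", ".caf")


def collect_files(directory, extensions=DEFAULT_EXTENSIONS):
    """Return the files in `directory` whose suffix is one of `extensions`."""
    wanted = {ext.lower() for ext in extensions}
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() in wanted
    )


def check_input(files):
    """Return an error message when there is nothing to compare (assumed rule)."""
    if len(files) < 2:
        return "Not enough input files."
    return None


if __name__ == "__main__":
    files = collect_files("./demo", extensions=(".ogg", ".wav"))
    print(check_input(files) or files)
```
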
 Motivation
 ==========

-I play guitar and do recordings from time to time mainly with Voice Memos on iPhone.
-Over the years, I have hundreds of recordings like that and I though it would be cool
-to find all the similar ones and see how I progress over the years.
+I play guitar and do recordings from time to time, mainly with Voice Memos on iPhone. Over the years, I have hundreds of recordings like that, and I thought it would be cool to find all the similar ones and see how I have progressed over the years.