Let's leverage!

Now we're getting down to the nitty-gritty,
time to cut to the chase!


We keep working our way forward through this tutorial:

https://www.analyticsvidhya.com/blog/2017/08/audio-voice-processing-deep-learning/



import random

import librosa
import librosa.display
import matplotlib.pyplot as plt

# train is assumed to be loaded already (see the previous post):
# train = pd.read_csv('/home/heide/python-wave-analysis/train.csv')

# pick a random row from the training metadata
i = random.choice(train.index)

audio_name = train.ID[i]
print(audio_name)

# path = os.path.join(data_dir, 'Train', str(audio_name) + '.wav')

print('Class: ', train.Class[i])
x, sr = librosa.load('/home/heide/python-wave-analysis/Train/' + str(train.ID[i]) + '.wav')

plt.figure(figsize=(12, 4))
librosa.display.waveplot(x, sr=sr)






[Waveform plot: a randomly picked wave file; WAV format, 16 bit, PCM]
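By the way: on newer librosa versions (0.10 and up) waveplot no longer exists; waveshow is its successor. If the plotting call above throws an AttributeError, this variant should do the same job:

# librosa >= 0.10 removed waveplot; waveshow is the replacement
librosa.display.waveshow(x, sr=sr)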


The tutorial now has us determine, from the train.csv file, the counts of the various predefined categories.


train.Class.value_counts()

jackhammer          668
engine_idling       624
siren               607
drilling            600
dog_bark            600
street_music        600
children_playing    600
air_conditioner     600
car_horn            306
gun_shot            230
Class                 1
Name: Class, dtype: int64

And then comes this suggestion:

We see that jackhammer class has more values than any other class. So let us create our first submission with this idea.
test = pd.read_csv('/home/heide/python-wave-analysis/test.csv')
test['Class'] = 'jackhammer'
test.to_csv('/home/heide/python-wave-analysis/sub01.csv', index=False)

What comes out of that is this:

ID,Class
5,jackhammer
7,jackhammer
8,jackhammer
9,jackhammer
13,jackhammer
14,jackhammer
16,jackhammer
21,jackhammer
23,jackhammer
25,jackhammer
28,jackhammer
29,jackhammer
30,jackhammer
31,jackhammer
34,jackhammer
39,jackhammer
41,jackhammer
51,jackhammer
53,jackhammer

...
right through to the end, obviously.

And the tutorial's own comment speaks for itself:

This seems like a good idea as a benchmark for any challenge, but for this problem, it seems a bit unfair. This is so because the dataset is not much imbalanced.

Hence:

Let’s solve the challenge! Part 2: Building better models

Now let us see how we can leverage the concepts we learned above to solve the problem. We will follow these steps to solve the problem.
Step 1: Load audio files
Step 2: Extract features from audio
Step 3: Convert the data to pass it in our deep learning model
Step 4: Run a deep learning model and get results
Below is the code showing how I implemented these steps.

Step 1 and 2 combined: Load audio files and extract features


Aha, now comes the bit with the initially commented-out path (see the previous post!), or rather, all that's left is to put the path to the data folder into the variable:

data_dir = '/home/heide/python-wave-analysis/'
print(data_dir)
/home/heide/python-wave-analysis/

And when defining the function, you have to get the indentation right! I had to tinker around until finally no more errors were thrown (and was left standing in the rain for a while, because "no clue yet / hardly a clue", as we all know):


def parser(row):
    print("Reihe: ",row)
    print("str(row.ID)",str(row.ID))
    print("row.ID",row.ID)
   
    # function to load files and extract features
    file_name = os.path.join(os.path.abspath(data_dir), 'Train', str(row.ID) + '.wav')
    print("Dateiname: ", file_name)
   
    # handle exception to check if there isn't a file which is corrupted
    try:
        # here kaiser_fast is a technique used for faster extraction
        X, sample_rate = librosa.load(file_name, res_type='kaiser_fast')
        # we extract mfcc feature from data
        mfccs = np.mean(librosa.feature.mfcc(y=X, sr=sample_rate, n_mfcc=40).T,axis=0)
       
    except Exception as e:
        # NB: the original printed the undefined name 'file' here; file_name is what exists
        print("Error encountered while parsing file: ", file_name)
        return None, None

    feature = mfccs
    label = row.Class

    return [feature, label]

Die "prints" habe ich mir eingebaut, weil es zunächst jede Menge Fehlermeldungen gegeben hat. Und als str(row.ID) wurde lediglich ID ausgegeben - ich musste erst noch einmal die Zuordnung train = pd.read_csv("/home/heide/python-wave-analysis/train.csv") neu ausführen - warum auch immer.

The following then sets the function loose:


temp = train.apply(parser, axis=1)

Runs for a little while. There are quite a few waves to be checked, after all.
 
Dateiname:  /home/heide/python-wave-analysis/Train/8725.wav
Reihe:  ID           8726
Class    dog_bark
Name: 5431, dtype: object
str(row.ID) 8726
row.ID 8726
Dateiname:  /home/heide/python-wave-analysis/Train/8726.wav
Reihe:  ID                8727
Class    engine_idling
Name: 5432, dtype: object
str(row.ID) 8727
row.ID 8727
Dateiname:  /home/heide/python-wave-analysis/Train/8727.wav
Reihe:  ID                8728
Class    engine_idling
Name: 5433, dtype: object
str(row.ID) 8728
row.ID 8728
Dateiname:  /home/heide/python-wave-analysis/Train/8728.wav
Reihe:  ID                  8729
Class    air_conditioner
Name: 5434, dtype: object
str(row.ID) 8729
row.ID 8729
Dateiname:  /home/heide/python-wave-analysis/Train/8729.wav


Done!
And then comes:

temp.columns = ['feature', 'label']
temp.columns[0],temp.columns[1]
('feature', 'label')


Right. So it works. And now what?

Step 3: Convert the data to pass it in our deep learning model


from sklearn.preprocessing import LabelEncoder

X = np.array(temp.feature.tolist())
y = np.array(temp.label.tolist())

lb = LabelEncoder()

y = np_utils.to_categorical(lb.fit_transform(y))

AttributeError                            Traceback (most recent call last)
<ipython-input-149-1613f53e2d98> in <module>
      1 from sklearn.preprocessing import LabelEncoder
      2 
----> 3 X = np.array(temp.feature.tolist())
      4 y = np.array(temp.label.tolist())
      5 

~/anaconda3/lib/python3.7/site-packages/pandas/core/generic.py in __getattr__(self, name)
   5065             if self._info_axis._can_hold_identifiers_and_holds_name(name):
   5066                 return self[name]
-> 5067             return object.__getattribute__(self, name)
   5068 
   5069     def __setattr__(self, name, value):

AttributeError: 'Series' object has no attribute 'feature'


Damn, now I'm seriously stuck. Up to here I managed to solve all the smaller problems caused by the sloppiness of this tutorial, but at this point (after about an hour of searching & trying) I'm not getting anywhere. Stupid. There isn't much left to the finish. Why do I never reach the end with these deep learning model trainings? (It was the same with RapidMiner every time ... )-: .)


Wait and see, drink tea. That's all I can come up with right now.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Oh, man! I hate this.

Here's what I've figured out so far:

from sklearn.preprocessing import LabelEncoder

temp.tolist()

spits out this (excerpt):


[array([-192.07228   ,  132.09004   ,  -98.84943   ,    0.67741424,
          -25.435394  ,   -9.677651  ,  -31.589748  ,   -3.235866  ,
          -24.849829  ,   -5.7216616 ,  -23.516773  ,   -5.847936  ,
          -24.831627  ,  -10.386574  ,  -23.96386   ,  -17.12719   ,
          -21.741112  ,  -10.443719  ,  -11.343134  ,   -6.586451  ,
          -12.10374   ,  -11.110559  ,  -13.287957  ,  -13.126061  ,
          -15.36659   ,   -9.475932  ,   -8.290194  ,   -5.0611796 ,
           -5.2760816 ,   -7.8324695 ,   -7.3157797 ,   -5.873999  ,
           -2.988504  ,   -2.1846547 ,   -2.1187742 ,   -4.074119  ,
           -5.8642855 ,   -4.720396  ,   -1.1819433 ,   -2.6562045 ],
        dtype=float32), 'air_conditioner'],
 [array([-1.99528992e+02,  1.83668976e+02, -7.14284754e+00,  1.01079798e+01,
         -1.12582073e+01, -3.48339975e-01, -1.81892204e+01,  1.44609318e+01,
         -1.93229198e+01,  4.06136417e+00, -1.35920677e+01, -2.66213202e+00,
         -2.36060352e+01, -1.51015263e+01, -1.79608517e+01, -4.64226055e+00,
         -1.53724623e+01, -2.02773929e+00, -1.08224726e+00,  5.23184359e-01,
         -7.71550655e+00, -7.29981518e+00, -1.04677515e+01, -4.26738834e+00,
         -1.10490770e+01, -1.74581947e+01, -6.12203693e+00, -1.78527606e+00,
         -5.78033257e+00, -6.43251610e+00,  1.61077046e+00,  2.18034577e+00,
          1.02158523e+00, -5.08716106e+00, -1.24339879e-01, -5.41702843e+00,
         -1.12404287e+00, -1.16118360e+00, -5.49766123e-01, -3.77333021e+00],
        dtype=float32), 'air_conditioner'],
 [array([-363.90073   ,  176.40643   ,   42.35497   ,   44.845356  ,
           20.407352  ,   37.93451   ,    5.333189  ,   23.630617  ,
            4.0751557 ,   11.233556  ,    3.9465735 ,    4.8049235 ,
           -1.4119686 ,   -3.8330114 ,   -2.0721066 ,   -3.8421342 ,
           -6.660765  ,   -8.824786  ,   -1.8869449 ,   -6.5119634 ,
           -2.5806298 ,   -6.148154  ,   -3.3620937 ,   -4.1083584 ,
           -3.3536081 ,   -3.5247633 ,   -3.111913  ,   -3.25491   ,
           -3.8335528 ,   -3.0624204 ,   -1.0403792 ,   -1.7393869 ,
           -5.063504  ,   -1.2641332 ,   -0.51054686,   -3.520135  ,
           -3.4498098 ,   -1.9625036 ,   -1.7783222 ,   -4.239703  ],
        dtype=float32), 'engine_idling'],

Probably all I need to figure out is how to address the defined columns ... well, easier said than done ... nothing useful found in that direction yet ... oh man, oh man!!!!
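(In hindsight the traceback spells it out: train.apply(parser, axis=1) with a function that returns a list yields a Series of lists, not a DataFrame. So temp.columns = [...] merely pins an attribute onto the Series, which is why temp.columns[0] appeared to work while temp.feature has nothing to resolve to. A conversion along these lines should also do it; a sketch, assuming every entry of temp really is a [feature, label] pair and pd/np are already imported:)

# minimal sketch: turn the Series of [feature, label] lists into a real DataFrame
temp_df = pd.DataFrame(temp.tolist(), columns=['feature', 'label'])

X = np.array(temp_df.feature.tolist())
y = np.array(temp_df.label.tolist())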


This is how it seems to work now:



from sklearn.preprocessing import LabelEncoder

X = np.array(list(zip(*temp))[0])
y = np.array(list(zip(*temp))[1])

lb = LabelEncoder()

y = np_utils.to_categorical(lb.fit_transform(y))


"Seems" being the operative word. Sadly it doesn't shine as brightly as hoped:



---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-222-c52aff56b588> in <module>
      6 lb = LabelEncoder()
      7 
----> 8 y = np_utils.to_categorical(lb.fit_transform(y))

NameError: name 'np_utils' is not defined


I don't believe it! Either I'm stupid, or the tutorial is full of errors (which beginners don't spot), or I've got the wrong Python version, or ...


# pip install np_utils
import np_utils


from sklearn.preprocessing import LabelEncoder

X = np.array(list(zip(*temp))[0])
y = np.array(list(zip(*temp))[1])

lb = LabelEncoder()

y = np_utils.to_categorical(lb.fit_transform(y))

yields:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-12-c52aff56b588> in <module>
      6 lb = LabelEncoder()
      7 
----> 8 y = np_utils.to_categorical(lb.fit_transform(y))

AttributeError: module 'np_utils' has no attribute 'to_categorical'
 
Great, huh? Always something new.
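(Also in hindsight: the np_utils package from PyPI is an unrelated third-party module. What the tutorial means is keras.utils.np_utils, which it only imports in its Step 4 block further down. With a working Keras installation, the failing line presumably wants to read:)

# np_utils here means keras.utils.np_utils, not the standalone PyPI package;
# this assumes Keras (with a TensorFlow backend) is actually installed
from keras.utils import np_utils

y = np_utils.to_categorical(lb.fit_transform(y))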
 
I give up! Crap, so close to the finish!!!

I tried this one more thing, skipping the one step that doesn't work:

import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D
from keras.optimizers import Adam
from keras.utils import np_utils
from sklearn import metrics 

No luck there either:


Using TensorFlow backend.
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-31-0460ca22e504> in <module>
      1 import numpy as np
----> 2 from keras.models import Sequential
      3 from keras.layers import Dense, Dropout, Activation, Flatten
      4 from keras.layers import Convolution2D, MaxPooling2D
      5 from keras.optimizers import Adam

~/anaconda3/lib/python3.7/site-packages/keras/__init__.py in <module>
      1 from __future__ import absolute_import
      2 
----> 3 from . import utils
      4 from . import activations
      5 from . import applications

~/anaconda3/lib/python3.7/site-packages/keras/utils/__init__.py in <module>
      4 from . import data_utils
      5 from . import io_utils
----> 6 from . import conv_utils
      7 
      8 # Globally-importable utils.

~/anaconda3/lib/python3.7/site-packages/keras/utils/conv_utils.py in <module>
      7 from six.moves import range
      8 import numpy as np
----> 9 from .. import backend as K
     10 
     11 

~/anaconda3/lib/python3.7/site-packages/keras/backend/__init__.py in <module>
----> 1 from .load_backend import epsilon
      2 from .load_backend import set_epsilon
      3 from .load_backend import floatx
      4 from .load_backend import set_floatx
      5 from .load_backend import cast_to_floatx

~/anaconda3/lib/python3.7/site-packages/keras/backend/load_backend.py in <module>
     87 elif _BACKEND == 'tensorflow':
     88     sys.stderr.write('Using TensorFlow backend.\n')
---> 89     from .tensorflow_backend import *
     90 else:
     91     # Try and load external backend.

~/anaconda3/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py in <module>
      3 from __future__ import print_function
      4 
----> 5 import tensorflow as tf
      6 from tensorflow.python.framework import ops as tf_ops
      7 from tensorflow.python.training import moving_averages

ModuleNotFoundError: No module named 'tensorflow'
 
 
Too bad; got excited too soon about a seemingly really great tutorial!
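(For the record: that last traceback only says the Keras backend itself is missing. Assuming an Anaconda setup like the one above, installing TensorFlow should clear the way, though I haven't verified it here:)

# The ModuleNotFoundError above just means no backend is installed.
# In a terminal, or with a leading ! in Jupyter (untested assumption):
#     pip install tensorflow
import tensorflow as tf             # should now import cleanly
from keras.utils import np_utils    # and to_categorical lives here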
 
The comments there are almost nothing but praise, too; apparently I was the one who had to step right in it yet again ... grrrrrrrrrrrrrrrrrrrr.
 
 
 
Getting Started with Audio Data Analysis (Voice) using Deep Learning
 

40 Comments

  • kishor Peddolla
    Hi Faizan,
    It was great explanation thank you. and i am working like same problem but it is on the financial(bank customer) speech recognition problem, would you please help on this,
    Thank you in advance
    Regards,
    Kishor Peddolla
    • Faizan Shaikh
      Hey Kishor,
      Sure! Your problem seems interesting.
      I might add that Speech recognition is more complex than audio classification, as it involves natural language processing too. Can you explain what approach you followed as of now to solve the problem?
      Also, I would suggest creating a thread on discussion portal so that more people from the community could contribute to help you
  • Nice article, Faizan. Gives a good foundation to exploring audio data. Keep up the good work. Thanks
    Regards
    Karthik
  • Kalyanaraman
    Thanks. This is something I had been thinking for sometime.
  • Manoj
    Nice article. I liked the introduction to python libraries for audio. Any chance, you cover hidden markov models for audio and related libraries. Thank you
  • Georgios Sarantitis
    Hello Faizan and thank you for your introduction to sound recognition and clustering! Just a kind remark, I noticed that you have imported the Convolutional and maxpooling layers which you do not use so I guess there’s no need for them to be there….But I did say WOW when I saw them – I thought you would implement a CNN solution…
  • Nagu
    Hi Faizan
    This is a very good article to get started on Audio analysis. I do not think any other books out there could have given this type of explanation ! Keep up the great work !!!
  • Krish
    Great Work! Appreciate your effort in documenting this.
  • Gowri
    Great work faizan! I did go through this article and I find that most of machine learning articles require extensive knowledge of dataset or domain : like speech here. How does one do that and how do you decide to work on such problems ? Any references? I usually tend to follow moocs, but how to do self research and design end to end processes especially for machine learning?
    • Faizan Shaikh
      Hi Gowri,
      You are right to say that data science problems involve domain knowledge to solve problems, and this comes from experience in working on those kind of problems. When I take up a problem, I try to do as much research as I can and also, try to get hands on experience in it.
      Each person has his or her own learning process. So my process may or may not work for you. Still I would suggest a course that would help you https://www.coursera.org/learn/learning-how-to-learn
  • Darli Yang
    Hi Faizan,
    I got the following result, would you give some solutions to me:
    In [132]: model.fit(X, y, batch_size=32, epochs=5)
    Traceback (most recent call last):
    File "", line 1, in
    model.fit(X, y, batch_size=32, epochs=5)
    File "C:\Users\admin\Anaconda2\lib\site-packages\keras\models.py", line 867, in fit
    initial_epoch=initial_epoch)
    File "C:\Users\admin\Anaconda2\lib\site-packages\keras\engine\training.py", line 1522, in fit
    batch_size=batch_size)
    File "C:\Users\admin\Anaconda2\lib\site-packages\keras\engine\training.py", line 1378, in _standardize_user_data
    exception_prefix='input')
    File "C:\Users\admin\Anaconda2\lib\site-packages\keras\engine\training.py", line 144, in _standardize_input_data
    str(array.shape))
    ValueError: Error when checking input: expected dense_7_input to have shape (None, 40) but got array with shape (5435L, 1L)
  • Phani
    Thank you for the great explanation. Do you mind making the source code including data files and iPython notebook available through gitHub?
    • Faizan Shaikh
      Sure. Will do
      • Phani
        Hi Faizan,
        A friendly reminder about the ipython notebook you promised. Here is the reason for my curiosity. While experimenting with urban sound dataset (https://serv.cusp.nyu.edu/projects/urbansounddataset/urbansound8k.html), with an identical deep feed forward neural network like yours, the best accuracy I have achieved is 65%.
        That is after lots of hyper parameterization. I know in this blog you have reported similar accuracy and further alluded that you could achieve 80% accuracy. That is impressive, and I am aiming for similar result. However, I have noticed your dataset size is not the full 8K set. In my experimentation, I am using audio folders1-8 for training, folder 9 for validation and folder 10 for testing. I get 65% accuracy both on the validation and testing sets.
        Hope you could share your notebook or help me towards 80% accuracy goal. While I am currently experimenting with data augmentation, your help is much appreciated. I am aiming for this higher accuracy before using the trained model/parameters for a custom project of mine to classify a personal audio dataset.
        Thank you in advance,
        Phani.
        • Phani
          forgot to mention, for my training I am extracting 5 different datapoints (mfccs,chroma,mel,contrast,tonnetz) not just one (mfccs) like you did. With this fullset I get 65% accuracy. With mfccs alone I get only 53%. Also, 60% is the highest I saw so far in various other blogs with similar dataset. Interestingly convoluted networks (CNN) with mel features alone could not push this any further, making your results of 80% that much more impressive.
          Look forward to seeing your response.
          Thank you in advance.
  • Smitha
    Nice article… even I want to classify normal and pathological voice samples using keras… if I get any difficulty please help me regarding this….
  • Sourish
    Hi Faizan,
    Thank you for introducing this concept. However there is a basic problem,I am facing.
    I can’t install librosa, as every time I typed import librosa I got AttributeError: module ‘llvmlite.binding’ has no attribute ‘get_host_cpu_name’. I googled a lot, but didn’t find a solution for this. Can you please provide a solution here, so that I can proceed further.
    Thanks
  • Toke Hiber
    Hi sir.
    Thanks for this nice article. But how to I get datasets?
  • LouisCC
    Hi,
    How do you read train.scv to get train variable ?
    Thank You in advance
    Louis
  • Maxwel
    Can i get the dataset please
  • Houda Abzd
    Hi, I would like to use your example for my problem which is the separation of audio sources , I have some troubles using the code because I don’t know what do you mean by “train” , and also I need your data to run the example to see if it is working in my python, so can you plz provide us all the data through gitHub?
    • Aishwarya Singh
      Hi Houda,
      The dataset has two parts, train and test. The link to download the datasets is provided in the article itself.
  • Houda bzd
    Hi, thanks for the nice article,
    I have a problem dealing with the code, it gives me “name ‘train’ is not defined” even I have the dataset , can you help me plz ?
    Best.
    • Aishwarya Singh
      Hi,
      Glad you liked the article.
      Also, check the name you have set for the dataset you’re trying to load. I guess it should be ‘Train’, not ‘train’
  • Houda Abzd
    Hi Aishwarya ,
    First of all, thanks for your feedback, I download the data, otherwise, I get this error: TypeError: '<' not supported between instances of 'NoneType' and 'str', this error comes with this command:
    y = np_utils.to_categorical(lb.fit_transform(y))
    knowing that I am using Python 3.6. Any help or suggestion, I will be appreciating that
    Best.
