In: Computer Science
Deep learning/LSTM/MATLAB
There is MATLAB code that performs the following steps for deep learning with an LSTM network. I need to change the first three steps so that the model is trained on our own dataset; the other steps do not need to change.
I need to apply this to .ogg audio files, so please create and use some .ogg audio files as sample data and give me the code.
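For illustration, here is a minimal sketch of one way such sample .ogg files could be created from the same kinds of synthetic noise used in the code below. The folder name sampleOgg, the file names, and the number of clips are assumptions made only for this example; in practice you would write or copy your own recordings instead:

% Sketch only: write a few sample .ogg clips per class to disk.
% Folder/file names and clip counts are illustrative assumptions.
fs = 44.1e3;                % sample rate, same as the original code
duration = 0.5;             % seconds per clip
N = duration*fs;            % samples per clip
numClips = 10;              % a handful of clips per class for the example

classes = ["white","brown","pink"];
for c = classes
    outDir = fullfile("sampleOgg",c);   % one subfolder per class
    if ~isfolder(outDir)
        mkdir(outDir);
    end
    for k = 1:numClips
        switch c
            case "white"
                x = 2*rand(N,1) - 1;                      % white noise in [-1,1]
            case "brown"
                x = filter(1,[1,-0.999],2*rand(N,1) - 1); % brown noise
            case "pink"
                x = pinknoise(N);                         % requires Audio Toolbox
        end
        x = x./max(abs(x));                               % scale so audiowrite does not clip
        audiowrite(fullfile(outDir,sprintf("%s_%02d.ogg",c,k)),x,fs);
    end
end

A companion sketch showing how these .ogg files could be read back in to replace the first steps of the pipeline is given after the expected output at the end.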
The following steps are for your information:
Code:
% Generate 1000 half-second clips each of white, brown, and pink noise
fs = 44.1e3;
duration = 0.5;
N = duration*fs;
wNoise = 2*rand([N,1000]) - 1;
wLabels = repelem(categorical("white"),1000,1);
bNoise = filter(1,[1,-0.999],wNoise);
bNoise = bNoise./max(abs(bNoise),[],'all');
bLabels = repelem(categorical("brown"),1000,1);
pNoise = pinknoise([N,1000]);
pLabels = repelem(categorical("pink"),1000,1);
sound(wNoise(:,1),fs)
melSpectrogram(wNoise(:,1),fs)
title('White Noise')
sound(bNoise(:,1),fs)
melSpectrogram(bNoise(:,1),fs)
title('Brown Noise')
sound(pNoise(:,1),fs)
melSpectrogram(pNoise(:,1),fs)
title('Pink Noise')
% Split the signals into training and validation sets
audioTrain = [wNoise(:,1:800),bNoise(:,1:800),pNoise(:,1:800)];
labelsTrain = [wLabels(1:800);bLabels(1:800);pLabels(1:800)];
audioValidation = [wNoise(:,801:end),bNoise(:,801:end),pNoise(:,801:end)];
labelsValidation = [wLabels(801:end);bLabels(801:end);pLabels(801:end)];

% Configure an audioFeatureExtractor and extract features from the training set
aFE = audioFeatureExtractor("SampleRate",fs, ...
    "SpectralDescriptorInput","melSpectrum", ...
    "spectralCentroid",true, ...
    "spectralSlope",true);
featuresTrain = extract(aFE,audioTrain);
[numHopsPerSequence,numFeatures,numSignals] = size(featuresTrain)
featuresTrain = permute(featuresTrain,[2,1,3]);
featuresTrain = squeeze(num2cell(featuresTrain,[1,2]));
numSignals = numel(featuresTrain)
[numFeatures,numHopsPerSequence] = size(featuresTrain{1})
featuresValidation = extract(aFE,audioValidation);
featuresValidation = permute(featuresValidation,[2,1,3]);
featuresValidation = squeeze(num2cell(featuresValidation,[1,2]));
layers = [ ...
sequenceInputLayer(numFeatures)
lstmLayer(50,"OutputMode","last")
fullyConnectedLayer(numel(unique(labelsTrain)))
softmaxLayer
classificationLayer];
options = trainingOptions("adam", ...
"Shuffle","every-epoch", ...
"ValidationData",{featuresValidation,labelsValidation}, ...
"Plots","training-progress", ...
"Verbose",false);
net = trainNetwork(featuresTrain,labelsTrain,layers,options);
% Test the trained network on one new signal of each type
wNoiseTest = 2*rand([N,1]) - 1;
classify(net,extract(aFE,wNoiseTest)')
bNoiseTest = filter(1,[1,-0.999],wNoiseTest);
bNoiseTest = bNoiseTest./max(abs(bNoiseTest),[],'all');
classify(net,extract(aFE,bNoiseTest)')
pNoiseTest = pinknoise(N);
classify(net,extract(aFE,pNoiseTest)')
The expected classifications for the three test signals are:

For the first test (white noise):
ans = categorical white

For the second test (brown noise):
ans = categorical brown

For the third test (pink noise):
ans = categorical pink
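As a companion to the sketch near the top, here is a minimal sketch of how the first steps of the code could be replaced so that the training and validation data come from .ogg files on disk instead of in-memory noise matrices. It assumes the folder layout from the earlier sketch (one subfolder per class under sampleOgg) and that all clips share the same length and sample rate; everything from the audioFeatureExtractor onward is unchanged:

% Sketch only: read .ogg files from disk and build audioTrain / audioValidation.
% The folder "sampleOgg" and its per-class subfolders are assumptions from the
% earlier sketch; point the datastore at your own dataset instead.
ads = audioDatastore("sampleOgg", ...
    "IncludeSubfolders",true, ...
    "FileExtensions",".ogg", ...
    "LabelSource","foldernames");        % labels come from the subfolder names

[adsTrain,adsValidation] = splitEachLabel(ads,0.8);   % 80/20 split per class

% All clips here have the same length, so they can be stacked as columns,
% matching the audioTrain matrix used by the original code.
trainData = readall(adsTrain);
audioTrain = cat(2,trainData{:});
labelsTrain = adsTrain.Labels;

validationData = readall(adsValidation);
audioValidation = cat(2,validationData{:});
labelsValidation = adsValidation.Labels;

[~,fs] = audioread(ads.Files{1});        % recover the sample rate from one file

% From here on, the original pipeline is unchanged:
% aFE = audioFeatureExtractor(...), extract, trainNetwork, classify.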