How to generate X,Y dataset? #7

Open
madan0511 opened this issue Nov 14, 2019 · 2 comments

Comments

@madan0511

How do you create the X, Y dataset for training? Should we build arrays from the voice clips and save those, or should we create an array file from train.wav?

@IhabBendidi

IhabBendidi commented Dec 9, 2019

That depends on your goal. If you just want to use the model for the "activate" word, follow the tutorial exactly and download the Data.zip linked in the Readme.
If you want to develop a custom model with your own trigger word, create the training data with these steps:

  • In raw_data/activates, put the new wav files that contain your new trigger word. The more you add, the better. Leave the other folders as they are.
  • Modify the In [19] cell: find where the number_of_activates variable is initialised. It is set to a random number between 0 and 5; change the 5 to the number of custom wav files you added to the activates folder.
  • In the notebook, instead of loading the data as in cell In [23], you could try something similar to this code:

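# X and Y start as plain Python lists, one entry per generated example
X, Y = [], []
for background in backgrounds:
    # Generate 100 synthetic clips on top of each background recording
    for i in range(100):
        x, y = create_training_example(background, activates, negatives)
        X.append(x)
        Y.append(y)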

  • This gives you X and Y lists holding all your data. Convert them to arrays with the np.array function, then bring them to the shapes the model needs: (nb_observations, 5511, 101) for X and (nb_observations, 1375, 1) for Y, where nb_observations is the length of the X and Y lists you just built. np.reshape is enough when only the axis sizes need regrouping. If the axes themselves are in the wrong order (for example, each x comes out as (101, 5511)), use np.swapaxes or np.transpose instead, because np.reshape would give the right shape with scrambled values.

  • Now split X and Y into training and test sets. Take the first 80% of the observations for training, and put the remaining 20% into new variables, X_dev and Y_dev, for testing.

  • Now you're set to continue with the rest of the notebook.
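The conversion and split steps above could be sketched roughly as follows. This is only an illustration: `stack_examples` and `split_train_dev` are hypothetical helper names, not functions from the notebook, and the random stand-in data assumes each example already has the per-example shapes (5511, 101) and (1375, 1). In the real notebook, X_list and Y_list would be the lists filled by the loop over backgrounds.

```python
import numpy as np


def stack_examples(X_list, Y_list):
    """Turn two lists of per-example arrays into two batched arrays."""
    X = np.array(X_list)
    Y = np.array(Y_list)
    if X.shape[0] != Y.shape[0]:
        raise ValueError("X and Y must contain the same number of examples")
    return X, Y


def split_train_dev(X, Y, train_fraction=0.8):
    """Keep the first train_fraction of the examples for training, the rest for dev."""
    n_train = int(X.shape[0] * train_fraction)
    return X[:n_train], Y[:n_train], X[n_train:], Y[n_train:]


# Stand-in data (assumption): 10 examples already in the target per-example shapes.
rng = np.random.default_rng(0)
X_list = [rng.random((5511, 101), dtype=np.float32) for _ in range(10)]
Y_list = [np.zeros((1375, 1), dtype=np.float32) for _ in range(10)]

X, Y = stack_examples(X_list, Y_list)
X_train, Y_train, X_dev, Y_dev = split_train_dev(X, Y)

print(X_train.shape, Y_train.shape)  # (8, 5511, 101) (8, 1375, 1)
print(X_dev.shape, Y_dev.shape)      # (2, 5511, 101) (2, 1375, 1)
```

Because the loop generates examples one background at a time, taking the first 80% in order means the dev set comes mostly from the last backgrounds. Shuffling X and Y with a shared permutation before the split is one way to avoid that, if it matters for your data.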

@SaudAlmajed

How do I reshape X to (5511, 101)?
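One possible answer, as a sketch: assuming each example comes out as a spectrogram of shape (101, 5511), that is (frequency bins, time steps), the axes need to be swapped rather than reshaped. That starting shape is an assumption about what create_training_example returns; if your x is already (5511, 101), nothing needs to change. The arrays below are stand-ins, not real spectrograms.

```python
import numpy as np

n_freq, Tx = 101, 5511

# Stand-in for one spectrogram with shape (frequency bins, time steps).
x = np.arange(n_freq * Tx, dtype=np.float32).reshape(n_freq, Tx)

# Transposing swaps the axes: x_t[t, f] == x[f, t].
x_t = x.T
print(x_t.shape)  # (5511, 101)

# reshape gives the same shape but keeps the flat element order,
# so rows no longer correspond to time steps.
x_r = x.reshape(Tx, n_freq)
print(np.array_equal(x_t, x_r))  # False

# For a whole batch of shape (n, 101, 5511), swap the last two axes.
batch = np.stack([x, x])
batch_t = np.swapaxes(batch, 1, 2)
print(batch_t.shape)  # (2, 5511, 101)
```

The same applies to Y if each y comes out as (1, 1375). Since one of those axes has size 1, either np.swapaxes or np.reshape yields the same (1375, 1) result there.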
