Using the same data for both validation and prediction in Keras


Amirreza Heidari <amirrezaheidarysbu at gmail.com> writes:

> I was reading a tutorial on time series prediction with neural
> networks. I found that the following code uses the same test data
> for validation, and later also for prediction.
>
> history = model.fit(train_X, train_y, epochs=50, batch_size=72, validation_data=(test_X, test_y), verbose=2, shuffle=False)
>
> Does it mean that the validation and test data are the same, or is
> there a default percentage used to split the data into validation and
> prediction sets?

According to Prof. Andrew Ng, training, cross-validation and testing
should each use a separate data set. If you have a small example set
(for example 10,000 or maybe 50,000 examples), you can split it in a
60:20:20 ratio for train:validation:test. If you have a very large
data set (1 million or 10 million examples), consider using 1% or even
less for validation and testing.
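
For reference, in the quoted call the validation arrays are passed
explicitly through validation_data, so Keras uses exactly those arrays
and does no automatic split (validation_split defaults to 0.0). A
minimal sketch of the 60:20:20 split described above follows. The helper
name chronological_split and the synthetic arrays are assumptions made
for illustration and are not from the tutorial. The split is kept in
time order, without shuffling, because the data is a time series.

```python
import numpy as np


def chronological_split(X, y, train_frac=0.6, val_frac=0.2):
    """Split X and y into train/validation/test parts, preserving order.

    Hypothetical helper for illustration; the remainder after the train
    and validation fractions becomes the test set.
    """
    if len(X) != len(y):
        raise ValueError("X and y must have the same number of samples")
    n = len(X)
    train_end = int(round(n * train_frac))
    val_end = train_end + int(round(n * val_frac))
    return (
        (X[:train_end], y[:train_end]),
        (X[train_end:val_end], y[train_end:val_end]),
        (X[val_end:], y[val_end:]),
    )


# Synthetic stand-in data: 100 samples with 10 features each (assumption).
X = np.arange(1000, dtype="float32").reshape(100, 10)
y = np.arange(100, dtype="float32")

(train_X, train_y), (val_X, val_y), (test_X, test_y) = chronological_split(X, y)
print(len(train_X), len(val_X), len(test_X))

# With a compiled Keras model (not built here), the sets would be used as:
# history = model.fit(train_X, train_y, epochs=50, batch_size=72,
#                     validation_data=(val_X, val_y), verbose=2, shuffle=False)
# model.evaluate(test_X, test_y)  # test data is touched only after training
```

With this arrangement the test set plays no part in training or in
tuning, so the score it gives is not biased by choices made while
watching the validation loss.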

-- 
Pankaj Jangid