Output is NaN #77

alexnix · 2016-10-25T15:28:34Z

I train my network on a set of car data (year of manufacture, mileage, type, and model as input; price as output). When I try to predict the price for another car, the output is NaN. NaN is not even among the values in the training set, so this seems like an issue with the brain module.

My code is in a GitHub Gist, here: https://gist.github.com/alexnix/146fea914501d283c80635087dd87036

nickpoorman · 2016-10-25T15:33:56Z

@alexnix - Your input type and model are non-numeric values. You must normalize your data first. I suggest one-hot encoding them: https://github.com/nickpoorman/one-hot

alexnix · 2016-10-25T16:13:47Z

One-hot encoding for type and model?

I just learned from a YouTube video that input values must be in [-1, 1]. I am still wondering how to represent model and type. I could just label them with numbers, like Audi 0.1, Opel 0.2 and so on, but this would make the network treat Audi and Opel as similar because their labels are close to each other. Is there a good way to represent inputs with discrete, uncorrelated values (such as car model, in my example)?

Disclaimer: noob in AI/ML/NN here.

nickpoorman · 2016-10-25T16:49:18Z

@alexnix - Yes, one-hot encoding solves the issue of having features with "categorical" values. In your case, for example, type will be expanded from a single vertical column in the matrix ("Mazda", "Ford", "Volkswagen", "Renault", "Kia", "Hyundai", etc.) into horizontal columns, one per type, each holding a boolean flag that is 1 if the row is that type.

For example: Your first input is: { year: 2009, mileage: 311000, type: "Mazda", model: "CX-7" }

So it will become something like:

[2009, 311000, 1, 0, 0, 0, 0, 0, ...]

Where the header for the columns might be:

[year, mileage, type_mazda, type_ford, type_volkswagen, type_renault, type_kia, type_hyundai, ...]

You might also want to normalize your values between 0 and 1 for this library. Get the min and max value for each column and then scale them between 0 and 1. https://github.com/nickpoorman/scale-number-range
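
A minimal sketch of both steps together (min-max scaling plus one-hot flags), in plain JavaScript with no dependencies. The helper names `minMaxScale` and `buildEncoder` are made up for illustration and are not part of brain.js or the modules linked above. The first row is the example from this comment; the other two rows are invented so that the columns have a range to scale over.

```javascript
// Hypothetical helpers for illustration; not part of brain.js or the linked modules.

// Scale a value from [min, max] to [0, 1]. Returns 0 when the range is empty,
// because dividing by zero here would produce NaN.
function minMaxScale(value, min, max) {
  if (max === min) return 0;
  return (value - min) / (max - min);
}

// Build an encoder from the raw rows: numeric columns are min-max scaled,
// categorical columns are expanded into one 0/1 flag per distinct value.
function buildEncoder(rows, numericKeys, categoricalKeys) {
  const ranges = {};
  for (const key of numericKeys) {
    const values = rows.map(row => row[key]);
    ranges[key] = { min: Math.min(...values), max: Math.max(...values) };
  }
  const categories = {};
  for (const key of categoricalKeys) {
    categories[key] = [...new Set(rows.map(row => row[key]))];
  }
  return function encode(row) {
    const scaled = numericKeys.map(key =>
      minMaxScale(row[key], ranges[key].min, ranges[key].max));
    const flags = categoricalKeys.flatMap(key =>
      categories[key].map(category => (row[key] === category ? 1 : 0)));
    return scaled.concat(flags);
  };
}

const cars = [
  { year: 2009, mileage: 311000, type: 'Mazda', model: 'CX-7' }, // from the example above
  { year: 2012, mileage: 120000, type: 'Ford', model: 'Focus' }, // made-up row
  { year: 2015, mileage: 60000, type: 'Kia', model: 'Rio' },     // made-up row
];

const encode = buildEncoder(cars, ['year', 'mileage'], ['type']);
const encoded = cars.map(encode);
// encoded[0] is [0, 1, 1, 0, 0]: year, mileage, type_Mazda, type_Ford, type_Kia
```

The same encoder has to be reused at prediction time, so a new car is scaled with the training set's min/max and gets the same column order.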

alexnix · 2016-10-25T17:24:01Z

Thank you for your advice, it was very useful indeed.

robertleeplummerjr · 2016-12-21T14:09:56Z

Is this issue resolved?

Dok11 · 2016-12-21T15:37:13Z

@nickpoorman, may I ask a question?
You wrote:

Where the header for the columns might be:
[year, mileage, type_mazda, type_ford, type_volkswagen, type_renault, type_kia, type_hyundai, ...]

Does that mean the original poster must also encode "model" the same way as in your example? Like this:
{model__mazda_cx_7: 1, model__bmw_x5: 0, model__nissan_xtrail: 0,...}

That is a lot of columns. Is that normal?

nickpoorman · 2016-12-21T15:58:15Z

@Dok11, brain.js only allows numeric values as inputs. One-hot encoding lets you transform the distinct values found down a column (the Y-axis) into separate input columns (the X-axis), each holding an "ON or OFF" value.

This will naturally increase the dimensionality of the inputs, so yes, you will always end up with more columns. To reduce the number of columns, run your data set through dimensionality reduction via PCA, Lasso, or some other means.

This brain library is probably not what you want for highly dimensional data. Try using a library that can do matrix transforms quickly via BLAS or some other more efficient means.

robertleeplummerjr · 2016-12-21T16:10:17Z

@nickpoorman & @Dok11, the new repository does do matrix transforms via the recurrent neural net...
Example from: BrainJS/brain.js@338cf70#diff-04c6e90faac2675aa89e2176d2eec7d8R25

//create a simple recurrent neural network
var net = new brain.recurrent.RNN();

net.train([{input: [0, 0], output: [0]},
           {input: [0, 1], output: [1]},
           {input: [1, 0], output: [1]},
           {input: [1, 1], output: [0]}]);
	
var output = net.run([0, 0]);  // [0]
output = net.run([0, 1]);  // [1]
output = net.run([1, 0]);  // [1]
output = net.run([1, 1]);  // [0]

Dok11 · 2016-12-21T18:13:43Z

@robertleeplummerjr, that's cool, but not for this task, right?
P.S. Yes, I am building a very similar NN, so this question is very interesting to me ;)

Dok11 · 2016-12-21T18:22:16Z

P.P.S. Where can I see more examples?
https://github.com/harthur-org/brain.js/wiki is empty...

Dok11 · 2016-12-21T18:25:47Z

@nickpoorman: This will naturally increase the dimensionality of the inputs, so yes you will always end up with more columns. To reduce the number of columns, run your data set through dimensionality reduction via PCA, Lasso, or some other means.

What about defining a dictionary such as:
['cx-7', 'x5', 'x-trail', ...]
and using its keys (indexes) in the input, like:
{... type: 0, ...}

Will this work correctly?

cawa-93 · 2017-03-06T09:15:14Z

I have the same problem, but all of the input signals are already normalized:

const neural = require('../NeuralNetwork').toFunction() // This directory holds the neural network and the training array
neural({
  albums: 0.011111111111111112,
  videos: 0.016523867809057527,
  audios: 0,
  notes: 0,
  photos: 0.00035337249878528203,
  friends: 0.009302790837251175,
  mutual_friends: 0,
  followers: 0.007113002799187086,
  subscriptions: 0,
  pages: 0.0063083522583901085,
  wall: 0.0005448000778285826
}) // { '0': NaN }

An example training sample:

{
  "input":{
    "albums":0,
    "videos":0.002345981232150143,
    "audios":0,
    "notes":0,
    "photos":0.019921374619020275,
    "friends":0.06461938581574472,
    "mutual_friends":0,
    "followers":0.004280263813796541,
    "subscriptions":0,
    "pages":0.0010093363613424174,
    "wall":0.22041054577293512
  },
  "output":[0]
}

When I train with only one element of the training set, the network returns a numerical result. But if I use two or more elements, the result is not a number.

Full training array in .json
All data were normalized using scale-number-range

nickpoorman · 2017-03-06T13:30:54Z

@cawa-93 - I would have to take a look at the rest of your code - the setup of the network and how you are training the model. Another thing you should try is not using category mode. Simply supply your input vector as an array. Instead of this:

{
  "input":{
    "albums":0,
    "videos":0.002345981232150143,
    "audios":0,
    "notes":0,
    "photos":0.019921374619020275,
    "friends":0.06461938581574472,
    "mutual_friends":0,
    "followers":0.004280263813796541,
    "subscriptions":0,
    "pages":0.0010093363613424174,
    "wall":0.22041054577293512
  },
  "output":[0]
}

do this:

{
  "input":[
    0,
    0.002345981232150143,
    0,
    0,
    0.019921374619020275,
    0.06461938581574472,
    0,
    0.004280263813796541,
    0,
    0.0010093363613424174,
    0.22041054577293512
  ],
  "output":[0]
}
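
If it helps, the conversion can be done with a fixed key list, so every sample ends up with the same column order. This is only a sketch: `FEATURE_KEYS` and `toArraySample` are names made up for illustration, and the key order is simply the order used in the sample above.

```javascript
// Fixed column order, copied from the sample above.
const FEATURE_KEYS = [
  'albums', 'videos', 'audios', 'notes', 'photos', 'friends',
  'mutual_friends', 'followers', 'subscriptions', 'pages', 'wall',
];

// Hypothetical helper: turn a keyed sample into an array sample.
// Throws if a feature is missing or is not a finite number, so bad rows
// are caught before training instead of showing up later as NaN.
function toArraySample(sample) {
  const input = FEATURE_KEYS.map(key => {
    const value = sample.input[key];
    if (!Number.isFinite(value)) {
      throw new Error(`feature "${key}" is missing or not a finite number`);
    }
    return value;
  });
  return { input, output: sample.output };
}

const sample = {
  input: {
    albums: 0,
    videos: 0.002345981232150143,
    audios: 0,
    notes: 0,
    photos: 0.019921374619020275,
    friends: 0.06461938581574472,
    mutual_friends: 0,
    followers: 0.004280263813796541,
    subscriptions: 0,
    pages: 0.0010093363613424174,
    wall: 0.22041054577293512,
  },
  output: [0],
};

const arraySample = toArraySample(sample);
// arraySample.input has 11 numbers, in FEATURE_KEYS order
```

The same key list should be used when building the input for a prediction, otherwise the columns will not line up with what the network was trained on.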

I've been using this in production for three years (training millions of models and making billions of predictions monthly), and I assure you there is nothing wrong with the library.

cawa-93 · 2017-03-06T14:38:28Z

@nickpoorman I created a simple repository for you: cawa-93/user-scaner

I noticed that if I train the network with objects, numerical data is stored in net.json; however, if I train with arrays, all values are null.

robertleeplummerjr · 2017-03-06T14:47:07Z

You're letting negative values be returned from scaleNumberRange, which I assume is your means of normalizing values.

/**
 * simple module to scale a number from one range to another
 */
var debug = require('debug')('scale-number-range');

module.exports = function scaleNumberRange(number, oldMin, oldMax, newMin, newMax) {
  if (process.env.SCALE_THROW_OOB_ERRORS) {
    if (number < oldMin) {
      debug('ERROR OOB - scale(%d, %d, %d, %d, %d)', number, oldMin, oldMax, newMin, newMax);
      throw new Error('number is less than oldMin');
    }
    if (number > oldMax) {
      debug('ERROR OOB - scale(%d, %d, %d, %d, %d)', number, oldMin, oldMax, newMin, newMax);
      throw new Error('number is greater than oldMax');
    }
  }
  const result = (((newMax - newMin) * (number - oldMin)) / (oldMax - oldMin)) + newMin;
  console.log(result);
  return result;
}

Outputs:

$ babel-node --presets es2015-node ./test
-1
-0.9953080375356997
-1
-1
-0.9601572507619595
-0.8707612283685106
-1
-0.9914394723724069
-1
-0.9979813272773151
-0.5591789084541298
{ '0': NaN }

robertleeplummerjr · 2017-03-06T14:49:47Z

If I multiply result by -1 (result *= -1), I still get NaN, so I'm still investigating.

nickpoorman · 2017-03-06T15:34:07Z

@cawa-93 - A few issues with your code. First, you should filter out any user data that doesn't have the same shape. The following is going to cause issues:

{
  "id": 305576398,
  "counters": {
    "unknown": true
  }
}

To do this use a filter:

const learnArray = users
.filter(user => {
  for (let key in maxRages) {
    if (typeof user.counters[key] === 'undefined') {
      return false
    }
  }
  return true
})
.map(user => {
  let result = {
    input: {},
    output: []
  }

  for (let c in user.counters) {
    if (c !== 'messages' && c !== 'online_friends') {
      result.input[c] = scale(user.counters[c], 0, maxRages[c], 0, 1)
    }
  }

  result.output.push(user.counters.messages > 3 ? 1 : 0)

  return result
})

Also, you should scale to [0, 1] instead of [-1, 1].

Lastly, instead of using toFunction(), you should just use run to solve your NaN problem.

I've updated some of the code in this gist: https://gist.github.com/nickpoorman/cd9465edca726df8dc06dbdd2937d153
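
To illustrate why the shape filter matters, here is a small sketch of how a missing counter turns into NaN. The `scale` function uses the same formula as the scale-number-range code quoted earlier in this thread, without the debug hooks; the `maxValues` numbers and the two users are made up for illustration.

```javascript
// Same formula as the scale-number-range code quoted above, minus the debug hooks.
function scale(number, oldMin, oldMax, newMin, newMax) {
  return (((newMax - newMin) * (number - oldMin)) / (oldMax - oldMin)) + newMin;
}

const maxValues = { friends: 1000 }; // made-up maximum

const complete = { counters: { friends: 250 } };
const incomplete = { counters: { unknown: true } }; // same shape as the bad record above

// A present counter scales normally.
const ok = scale(complete.counters.friends, 0, maxValues.friends, 0, 1); // 0.25

// A missing counter is undefined, and undefined - 0 is NaN,
// so the NaN goes straight into the training data.
const bad = scale(incomplete.counters.friends, 0, maxValues.friends, 0, 1); // NaN

// Dropping samples that contain anything other than finite numbers
// keeps NaN out of the training set.
const samples = [complete, incomplete].map(user => ({
  input: { friends: scale(user.counters.friends, 0, maxValues.friends, 0, 1) },
  output: [0],
}));
const usable = samples.filter(sample =>
  Object.values(sample.input).every(Number.isFinite));
// usable contains only the sample built from `complete`
```

Filtering on the raw user records, as in the snippet above, and checking the scaled samples for non-finite values are complementary: the first catches missing fields, the second catches anything that still slips through the scaling step.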

robertleeplummerjr · 2017-03-06T16:02:29Z

lol, beat me to it! In all fairness, I was getting a haircut.
