Fixes for float16 problems in the DLT #172
  avg_acceptance_rate = sharedX(target_acceptance_rate,
                                'avg_acceptance_rate')
- s_rng = TT.shared_randomstreams.RandomStreams(seed)
+ s_rng = theano.sandbox.rng_mrg.MRG_RandomStreams(seed)
AttributeError: 'module' object has no attribute 'sandbox'
code/rbm.py
Outdated
- self.input * T.log(T.nnet.sigmoid(pre_sigmoid_nv)) +
- (1 - self.input) * T.log(1 - T.nnet.sigmoid(pre_sigmoid_nv)),
+ self.input * T.log(T.nnet.sigmoid(pre_sigmoid_nv) + 0.001) +
+ (1 - self.input) * T.log(1 - T.nnet.sigmoid(pre_sigmoid_nv) + 0.001),
Would it make sense to add a log_sigmoid to Theano, like log_softmax?
I dropped that part since the root of the problem was elsewhere.
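For reference, the log_sigmoid idea raised above can be sketched in plain NumPy. This is only an illustration under assumptions: the function names `log_sigmoid` and `cross_entropy_terms` are hypothetical and are not part of Theano or of this PR. It relies on the identities log(sigmoid(x)) = -log(1 + exp(-x)) and log(1 - sigmoid(x)) = log(sigmoid(-x)), which avoid taking the log of a value that has already rounded to 0 and so need no additive constant such as 0.001.

```python
import numpy as np


def log_sigmoid(x):
    """Hypothetical helper: log(sigmoid(x)) computed as -log(1 + exp(-x)).

    np.logaddexp(0, -x) evaluates log(exp(0) + exp(-x)) without overflow,
    so the result stays finite even for very large |x|.
    """
    x = np.asarray(x, dtype=np.float64)
    return -np.logaddexp(0.0, -x)


def cross_entropy_terms(target, pre_sigmoid):
    """Hypothetical helper mirroring the expression in the diff above.

    Uses log(1 - sigmoid(x)) == log(sigmoid(-x)) for the second term.
    """
    target = np.asarray(target, dtype=np.float64)
    return (target * log_sigmoid(pre_sigmoid)
            + (1.0 - target) * log_sigmoid(-np.asarray(pre_sigmoid)))


pre_sigmoid = np.array([-1000.0, 0.0, 1000.0])
stable = log_sigmoid(pre_sigmoid)

# The naive form saturates: sigmoid(-1000) rounds to 0, so its log is -inf.
with np.errstate(over="ignore", divide="ignore"):
    naive = np.log(1.0 / (1.0 + np.exp(-pre_sigmoid)))
```

Here `naive[0]` is `-inf` while `stable[0]` stays finite, which is the failure mode the `+ 0.001` term was working around.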
  c.append(train_da(batch_index))

- print('Training epoch %d, cost ' % epoch, numpy.mean(c))
+ print('Training epoch %d, cost ' % epoch, numpy.mean(c, dtype='float64'))
Do you know what is going on with numpy.mean? If the entries of c are float16 outputs, I checked that numpy.mean returns float16. In my tests, it seems the accumulator is float16 or something like that. Do you know?
We would need to document that in Theano's float16 notes, at least in this issue: Theano/Theano#2908. I'll let you modify it, in case you can add more information.
Should we special-case float16 and make Theano always return at least float32 to help prevent this type of problem?
Yes, the accumulator is float16 internally and overflows.
Do we have a page about float16 gotchas in Theano? That is the only place where I would expect to see this sort of information.
I would oppose special-casing outputs in Theano, because the problem is easily resolved by the user and very visible.
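A minimal NumPy sketch of the overflow discussed in this thread. Whether numpy.mean itself accumulates in float16 may depend on the NumPy version and on the type of the inputs, so the sketch emulates a float16 accumulator explicitly; the helper name `mean_with_float16_accumulator` and the cost values are made up for illustration. It then applies the fix from the diff, `dtype='float64'`.

```python
import numpy as np


def mean_with_float16_accumulator(values):
    """Hypothetical helper: average while keeping the running sum in float16."""
    total = np.float16(0.0)
    with np.errstate(over="ignore"):  # silence the expected overflow warning
        for v in values:
            # float16 + float16 stays float16, so the sum saturates to inf
            # once it passes the largest finite float16 value (65504).
            total = np.float16(total + np.float16(v))
    return total / np.float16(len(values))


# 1000 per-batch costs of 100.0 each: the true mean (100.0) fits in float16,
# but the running sum (100000) does not.
costs = np.full(1000, 100.0, dtype=np.float16)

overflowed = mean_with_float16_accumulator(costs)
safe = np.mean(costs, dtype="float64")  # the fix used in this PR
```

With these inputs `overflowed` is `inf` and `safe` is `100.0`. The point is that the mean is representable in float16 even though the intermediate sum is not, which is why widening only the accumulator is enough.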
I added the comment to the issue about float16. (In reply to abergeron, Fri, Oct 28, 2016, 6:34 PM.)
Is this ready to merge?
No description provided.