
BIBLIOGRAPHY
Bengio, Y., Simard, P., and Frasconi, P. (1994). Learning long-term dependencies with
gradient descent is difficult. IEEE Tr. Neural Nets. 213, 214, 267, 274, 276, 277
Bengio, Y., LeCun, Y., Nohl, C., and Burges, C. (1995). Lerec: A NN/HMM hybrid for
on-line handwriting recognition. Neural Computation, 7(6), 1289–1303. 290
Bengio, Y., Ducharme, R., and Vincent, P. (2001a). A neural probabilistic language
model. In NIPS’00, pages 932–938. MIT Press. 16
Bengio, Y., Ducharme, R., and Vincent, P. (2001b). A neural probabilistic language
model. In NIPS’2000, pages 932–938. 319, 321, 322, 332
Bengio, Y., Ducharme, R., and Vincent, P. (2001c). A neural probabilistic language
model. In T. K. Leen, T. G. Dietterich, and V. Tresp, editors, NIPS’2000, pages
932–938. MIT Press. 433, 434
Bengio, Y., Ducharme, R., Vincent, P., and Jauvin, C. (2003a). A neural probabilistic
language model. JMLR, 3, 1137–1155. 321, 325, 332
Bengio, Y., Ducharme, R., Vincent, P., and Jauvin, C. (2003b). A neural probabilistic
language model. Journal of Machine Learning Research, 3, 1137–1155. 433, 434
Bengio, Y., Delalleau, O., and Le Roux, N. (2006a). The curse of highly variable functions
for local kernel machines. In NIPS’2005. 133
Bengio, Y., Larochelle, H., and Vincent, P. (2006b). Non-local manifold Parzen windows.
In NIPS’2005. MIT Press. 137, 431
Bengio, Y., Lamblin, P., Popovici, D., and Larochelle, H. (2007). Greedy layer-wise
training of deep networks. In NIPS’2006. 12, 16, 396, 397
Bengio, Y., Louradour, J., Collobert, R., and Weston, J. (2009). Curriculum learning. In
ICML’09. 158
Bengio, Y., Léonard, N., and Courville, A. (2013a). Estimating or propagating gradients
through stochastic neurons for conditional computation. arXiv:1308.3432. 332, 360
Bengio, Y., Yao, L., Alain, G., and Vincent, P. (2013b). Generalized denoising auto-
encoders as generative models. In NIPS’2013. 392, 508, 512
Bengio, Y., Courville, A., and Vincent, P. (2013c). Representation learning: A review and
new perspectives. IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI),
35(8), 1798–1828. 423, 506
Bengio, Y., Thibodeau-Laufer, E., Alain, G., and Yosinski, J. (2014a). Deep generative
stochastic networks trainable by backprop. Technical Report arXiv:1306.1091. 360
Bengio, Y., Thibodeau-Laufer, E., Alain, G., and Yosinski, J. (2014b). Deep generative
stochastic networks trainable by backprop. In ICML’2014. 360, 509, 510, 511, 513,
514