ERNIE 2.0: A continual pre-training framework for language understanding

ERNIE 2.0 (Enhanced Representation through kNowledge IntEgration) is a new knowledge-integration language representation model that aims to beat the state-of-the-art results of BERT and XLNet. Rather than pre-training on just a few simple tasks that capture the co-occurrence of words or sentences for language modeling, ERNIE aims to exploit named entities, semantic closeness and discourse relations to extract valuable lexical, syntactic and semantic information from the training corpora. ERNIE 2.0 focuses on incrementally building and learning pre-training tasks through continual multi-task learning, and it brings some interesting results.

NLP: Explaining Neural Language Modeling

Language modeling (LM) is an essential part of Natural Language Processing (NLP) tasks such as Machine Translation, Spell Correction, Speech Recognition, Summarization, Question Answering, Sentiment Analysis, etc. The goal of a language model is to compute the probability of a sentence, treated as a sequence of words. This article explains how to model language using probability and n-grams. It also discusses language model evaluation using perplexity.
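
To give a flavour of the idea, here is a minimal sketch of a bigram language model with perplexity. It assumes add-one (Laplace) smoothing and a tiny hand-made corpus, and the function names `train_bigram`, `sentence_log_prob` and `perplexity` are made up for this illustration; the full article may use a different estimator.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count unigrams and bigrams over tokenized sentences, with boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams


def sentence_log_prob(tokens, unigrams, bigrams):
    """Log-probability of a sentence as a product of bigram probabilities.

    Assumption: add-one smoothing, with the vocabulary taken to be every
    token seen in training (including the boundary markers).
    """
    vocab_size = len(unigrams)
    padded = ["<s>"] + tokens + ["</s>"]
    log_p = 0.0
    for prev, word in zip(padded, padded[1:]):
        log_p += math.log((bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size))
    return log_p


def perplexity(tokens, unigrams, bigrams):
    """Perplexity: exp of the negative average log-probability per predicted token."""
    n_predicted = len(tokens) + 1  # every word plus the end-of-sentence marker
    return math.exp(-sentence_log_prob(tokens, unigrams, bigrams) / n_predicted)


# Toy corpus, purely illustrative.
corpus = [["i", "like", "nlp"], ["i", "like", "tea"]]
unigrams, bigrams = train_bigram(corpus)
print(perplexity(["i", "like", "nlp"], unigrams, bigrams))
print(perplexity(["nlp", "like", "i"], unigrams, bigrams))
```

A word order seen in training should come out with a lower perplexity than the same words shuffled, which is the sense in which perplexity measures how well the model fits the text.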

The Transformer – Attention is all you need.

Transformer - more than meets the eye! Are we there yet? Well... not really, but...
How about eliminating recurrence and convolution from transduction? Solutions to sequence modeling and transduction problems (e.g. language modeling, machine translation) have been dominated by RNNs (especially gated variants such as LSTM), often combined with an attention mechanism. The main sequence transduction models are based on RNNs or CNNs and include an encoder and a decoder. The new Transformer architecture, however, is claimed to be more parallelizable and to require significantly less time to train, relying solely on attention mechanisms.
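
As a rough illustration of what "attention" means here, below is a minimal sketch of scaled dot-product attention, softmax(QKᵀ / √d_k)·V, written in plain Python with lists of floats. It assumes a single head with no masking and no learned projections, and the function names are made up for this example; a real Transformer layers much more on top of this.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """For each query, return a weighted average of the value vectors.

    The weights are the softmax of the query-key dot products, scaled by
    1/sqrt(d_k), where d_k is the key dimension.
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs


# Toy input, purely illustrative: one query attending over two positions.
queries = [[1.0, 0.0]]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]
print(scaled_dot_product_attention(queries, keys, values))
```

Every query is processed independently of the others, with no step-by-step recurrence over positions, which is where the parallelism claim comes from.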

Neural Networks Primer

When you come across a new term, you often find a Wiki page, Quora answers or blog posts, and it can take some time before you find a clear, ground-up definition with a meaningful example. Here I will collect the most intuitive explanations of basic topics. Because so many terms and concepts are used across the NN area, this post offers condensed definitions and brief explanations, just enough to build an intuition for the terms mentioned in other posts on this blog.
