Step 1: Tokenization (split each text into words).
Step 2: Build Dictionary (map each word to a unique index).
Step 3: Encoding (replace each word with its index, so a text becomes a sequence of integers).
Step 4: Align Sequences (pad or truncate every sequence to a fixed length; see the sketch below).
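A minimal sketch of the four steps using Keras (the toy corpus, the dictionary size of 10,000, and the sequence length of 8 are assumptions for illustration):

    from tensorflow.keras.preprocessing.text import Tokenizer
    from tensorflow.keras.preprocessing.sequence import pad_sequences

    texts = ["the cat sat on the mat", "dogs chase cats"]  # toy corpus

    tokenizer = Tokenizer(num_words=10000)            # Step 2: cap the dictionary size
    tokenizer.fit_on_texts(texts)                     # Steps 1-2: tokenize, build word -> index dictionary
    sequences = tokenizer.texts_to_sequences(texts)   # Step 3: encode each text as a list of indices
    aligned = pad_sequences(sequences, maxlen=8)      # Step 4: pad/truncate to equal length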
First, represent each word as a one-hot vector.
Second, map the one-hot vectors to low-dimensional word vectors using a parameter matrix (the embedding layer).
Total #parameters: vocabulary_size × embedding_dim
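A sketch to check this count, assuming a vocabulary of 10,000 words and 32-dimensional embeddings:

    from tensorflow.keras.layers import Embedding

    vocab_size, embedding_dim = 10000, 32   # assumed sizes
    layer = Embedding(vocab_size, embedding_dim)
    layer.build(input_shape=(None,))        # allocates the (vocab_size, embedding_dim) matrix
    print(layer.count_params())             # 10000 * 32 = 320,000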
Total #parameters of SimpleRNN: shape(h) × [shape(h) + shape(x)]
SimpleRNN is good at capturing short-term dependencies,
but bad at long-term dependencies.
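A sketch verifying the SimpleRNN count (shape(h) = 32 and shape(x) = 100 are assumed; note that Keras adds shape(h) bias terms on top of the formula above):

    from tensorflow.keras.layers import SimpleRNN

    h, x = 32, 100                             # assumed state and input dimensions
    rnn = SimpleRNN(units=h)
    rnn.build(input_shape=(None, None, x))     # (batch, time, features)
    # Formula: 32 * (32 + 100) = 4224 weights; Keras adds 32 biases.
    print(rnn.count_params())                  # 4256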
Total #parameters of LSTM: 4 × shape(h) × [shape(h) + shape(x)]
(The factor of 4 comes from the forget gate, input gate, output gate, and the new-value block.)
Observation: The embedding layer contributes most of the parameters!
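The same check for LSTM, illustrating the observation, with assumed sizes (vocabulary 10,000, embedding dim 32, state dim 32, sequence length 500):

    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Embedding, LSTM, Dense

    model = Sequential([
        Embedding(10000, 32),            # 10000 * 32 = 320,000 parameters
        LSTM(32),                        # 4 * 32 * (32 + 32) = 8192 weights (+ 128 biases in Keras)
        Dense(1, activation="sigmoid"),  # 33 parameters
    ])
    model.build(input_shape=(None, 500))
    model.summary()                      # the embedding layer dominates the total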
Step 1: Train a model on a large dataset.
• Perhaps for a different problem.
• Perhaps with a different model.
Step 2: Keep only the embedding layer.
Step 3: Train new LSTM and output layers on the target task (see the sketch below).
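A sketch of Steps 2-3 under assumed names: embedding_weights.npy is a hypothetical file holding the pretrained embedding matrix, with the same sizes as the example above.

    import numpy as np
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Embedding, LSTM, Dense

    vocab_size, embedding_dim = 10000, 32
    pretrained = np.load("embedding_weights.npy")   # hypothetical file: (vocab_size, embedding_dim)

    model = Sequential([
        Embedding(vocab_size, embedding_dim),   # Step 2: the layer we keep
        LSTM(32),                               # Step 3: new LSTM layer
        Dense(1, activation="sigmoid"),         # Step 3: new output layer
    ])
    model.build(input_shape=(None, 500))        # assumed sequence length
    model.layers[0].set_weights([pretrained])   # load the pretrained embedding matrix
    model.layers[0].trainable = False           # freeze it; only LSTM and Dense are trained
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

Freezing the embedding layer works because it holds most of the parameters (the observation above): reusing it lets the small target dataset train only the comparatively few LSTM and output-layer weights.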