20180036 권혁태
Part 1. Implement LSTM model in word-level language modelling
Task1. Implement LSTM network and LSTMCell module
MyLSTMCell
MyLSTM
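Since the implementation itself is not reproduced in text here, the following is a minimal sketch of the standard LSTM cell gate equations, under the assumption that `MyLSTMCell` follows them; the class name and layout are hypothetical, not the actual submitted code.

```python
import torch
import torch.nn as nn

class SimpleLSTMCell(nn.Module):
    """Illustrative LSTM cell: one linear map produces all four gate
    pre-activations (input, forget, cell candidate, output) at once."""

    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.linear = nn.Linear(input_size + hidden_size, 4 * hidden_size)
        self.hidden_size = hidden_size

    def forward(self, x, state):
        h, c = state
        gates = self.linear(torch.cat([x, h], dim=1))
        i, f, g, o = gates.chunk(4, dim=1)
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
        g = torch.tanh(g)
        c_next = f * c + i * g          # new cell state
        h_next = o * torch.tanh(c_next) # new hidden state
        return h_next, c_next
```

An `MyLSTM`-style network would then loop this cell over the time steps of the input sequence, carrying `(h, c)` forward.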
Result
Task2. Implement LSTM network with attention mechanism
MyLSTM_att
Result
Result with test sample
Task 2-2. Describe the advantages of using the attention mechanism in the above task

First of all, attention lets the model reflect contextual information. Without it, all information from the beginning of the sequence is compressed into the last hidden state; the attention mechanism instead measures the correlation between the current hidden state and the hidden states of the whole input. From the viewpoint of context, the model can therefore perceive the entire sequence by weighting how helpful each hidden state is for producing the current step's output.
Second, attention improves performance on long sentences. Information that is useful for the output may appear in an early state, and it can be diluted by the time it reaches the last state. By looking back at the states produced by earlier cells, the model can reflect early-stage inputs directly in the current hidden state.
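The weighting described above can be sketched as simple dot-product attention; this is an illustrative sketch, not necessarily the exact scoring used in `MyLSTM_att`.

```python
import torch
import torch.nn.functional as F

def dot_product_attention(query, keys):
    """query: (batch, hidden) current hidden state.
    keys:  (batch, seq_len, hidden) hidden states of the whole input.
    Returns a context vector (batch, hidden) and weights (batch, seq_len)."""
    # score each past hidden state against the current one
    scores = torch.bmm(keys, query.unsqueeze(2)).squeeze(2)  # (batch, seq_len)
    weights = F.softmax(scores, dim=1)                       # sum to 1 per example
    # context = weighted sum of all hidden states
    context = torch.bmm(weights.unsqueeze(1), keys).squeeze(1)
    return context, weights
```

The context vector is then combined with the current hidden state before the output projection, so early states with high scores contribute directly to the prediction.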
Part 2. CycleGAN implementation using PyTorch
Residual_Block
Train_D_ms
Train_D_sm
Train_G_msm
Train_G_sms
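The generator updates named above rest on cycle consistency (MNIST → SVHN → MNIST and the reverse). As a hedged sketch, assuming two generator callables `G_ms` and `G_sm` (hypothetical names), the cycle term can be written as a reconstruction loss:

```python
import torch
import torch.nn.functional as F

def g_cycle_loss(G_ms, G_sm, mnist_batch):
    """MNIST -> SVHN -> MNIST cycle: the reconstruction should match
    the original input. G_ms and G_sm are assumed generator modules."""
    fake_svhn = G_ms(mnist_batch)       # translate to the SVHN domain
    recon_mnist = G_sm(fake_svhn)       # translate back to MNIST
    return F.mse_loss(recon_mnist, mnist_batch)
```

`Train_G_sms` is the mirror image, starting from an SVHN batch; the full generator objective adds the adversarial term against the corresponding discriminator.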
Results
MNIST to SVHN
SVHN to MNIST
If the discriminator classifies only two classes, as in a conventional GAN, what problem will arise?
Originally the goal is just to generate, so judging real or fake is sufficient. However, this causes mode collapse: since the discriminator's only task is to decide real or fake, the generator can fool it by producing just one mode. For example, on MNIST, the GAN may end up generating only a single digit.
CycleGAN, in contrast, uses multiclass classification, which imposes a stricter rule, so the generator has to learn multiple modes to fool the discriminator. Because the discriminator produces a loss for every misclassification, the generator receives meaningful gradient signal.
Moreover, since CycleGAN generates images from images it has itself generated and computes a loss against the originals, mode collapse becomes even more critical. If mode collapse arises in CycleGAN, the cycle-consistency loss keeps growing, but a generator that has learned only one mode has no way to decrease it.
Therefore, with binary classification, mode collapse would make the discriminator and generator in CycleGAN oscillate rather than converge.
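The stricter multiclass rule can be sketched as follows: the discriminator outputs `num_classes + 1` logits, where the extra index stands for "fake", and is trained with cross-entropy on both real (labeled) and generated batches. This is an illustrative sketch under that assumption, not the exact `Train_D_ms` code.

```python
import torch
import torch.nn.functional as F

def d_loss_multiclass(d_logits_real, real_labels, d_logits_fake, num_classes=10):
    """Multiclass discriminator loss sketch.
    d_logits_real/d_logits_fake: (batch, num_classes + 1) logits,
    where index num_classes is the 'fake' class."""
    fake_labels = torch.full(
        (d_logits_fake.size(0),), num_classes, dtype=torch.long
    )
    loss_real = F.cross_entropy(d_logits_real, real_labels)  # classify real digits
    loss_fake = F.cross_entropy(d_logits_fake, fake_labels)  # mark fakes as class K
    return loss_real + loss_fake
```

Because producing a single digit cannot satisfy all ten real classes at once, a collapsed generator keeps incurring misclassification loss, which is the intuition behind the argument above.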