Appendix

Examples
Premise: a group of people examine a boat with an orange flag that is sitting on sand next to a body of water.
Original hypothesis: people sit on a beach to tan. Prediction: ⟨Neutral⟩
Genetic hypothesis: people sit on a swimming to tan. Prediction: ⟨Contradiction⟩
b-MHA hypothesis: people sit on a beach and tan. Prediction: ⟨Contradiction⟩
w-MHA hypothesis: people sit on a beach and tan. Prediction: ⟨Contradiction⟩

Premise: a woman lying in the grass in the park is wearing a red top and black capri pants and is barefoot.
Original hypothesis: a woman is sitting on a park bench wearing sandals. Prediction: ⟨Contradiction⟩
Genetic hypothesis: a woman is seated on a park bench wearing footwear. Prediction: ⟨Entailment⟩
b-MHA hypothesis: a woman is sitting on a park bench wearing it. Prediction: ⟨Entailment⟩
w-MHA hypothesis: a woman is sitting on a park bench wearing it. Prediction: ⟨Entailment⟩

Premise: a man alone crosscountry skis in the wilderness while wearing a huge backpack.
Original hypothesis: a man skis in the wilderness while it’s snowing Prediction: ⟨Neutral⟩
Genetic hypothesis: a man snowboarding in the wilderness while it’s snowing Prediction: ⟨Contradiction⟩
b-MHA hypothesis: a man skis in the world while it’s snowing Prediction: ⟨Contradiction⟩
w-MHA hypothesis: a man skis in the wilderness and it’s snowing Prediction: ⟨Contradiction⟩

Premise: a boy kneeling on a skateboard riding down the street
Original hypothesis: a boy standing upright on a skateboard. Prediction: ⟨Contradiction⟩
Genetic hypothesis: a boy permanent upright on a skateboard. Prediction: ⟨Entailment⟩
b-MHA hypothesis: a boy is out on a skateboard . Prediction: ⟨Entailment⟩
w-MHA hypothesis: a boy was upright on a skateboard. Prediction: ⟨Entailment⟩

Premise: three men are sitting on a beach dressed in orange with refuse carts in front of them.
Original hypothesis: empty trash cans are sitting on a beach. Prediction: ⟨Contradiction⟩
Genetic hypothesis: empties trash cans are sitting on a beach.. Prediction: ⟨Entailment⟩
b-MHA hypothesis: the trash cans are sitting in a beach. Prediction: ⟨Entailment⟩
w-MHA hypothesis: the trash cans are sitting on a beach. Prediction: ⟨Entailment⟩

Premise: hikers walk along some tough terrain.
Original hypothesis: hiking pace along rough terrain. Prediction: ⟨Entailment⟩
Genetic hypothesis: hiking pace along rough terra. Prediction: ⟨Neutral⟩
b-MHA hypothesis: hiking is in rough terrain. Prediction: ⟨Neutral⟩
w-MHA hypothesis: the pace along rough terrain. Prediction: ⟨Neutral⟩

Premise: our people walking beside each other down a street, one of the men is turned around looking toward the camera.
Original hypothesis: a group of friends are headed to wendys Prediction: ⟨Neutral⟩
Genetic hypothesis: a groups of boyfriends are guided to wendys Prediction: ⟨Contradiction⟩
b-MHA hypothesis: a group of women are expected to wendys Prediction: ⟨Contradiction⟩
w-MHA hypothesis: a number of people are going to wendys Prediction: ⟨Contradiction⟩

Premise: a man in a green shirt hovers above the ground in the laundry room.
Original hypothesis: the man appears to be suspended in midair. Prediction: ⟨Entailment⟩
Genetic hypothesis: the man emerge to be suspended in midair. Prediction: ⟨Neutral⟩
b-MHA hypothesis: the man appears to be suspended in 2007. Prediction: ⟨Neutral⟩
w-MHA hypothesis: the man is to be suspended in midair. Prediction: ⟨Neutral⟩
Table 1: Adversarial examples generated on SNLI.

Appendix A Hyper-parameters

The hyper-parameters of the MHA model, the genetic baseline model, and the victim models are listed below.

MHA.

We set the hyper-parameters of MHA to $p_{r}=0.5$, $p_{i}=0.25$, $p_{d}=0.25$. Constraints on $LM(x)$ and $C(\tilde{y}|x)$ are enforced: if $LM(x^{\prime})<t_{LM}\cdot LM(x)$ or $C(\tilde{y}|x^{\prime})<t_{C}\cdot C(\tilde{y}|x)$, the proposal is rejected directly. This trick ensures that we do not lose sentence fluency or target probability rapidly. $t_{LM}$ and $t_{C}$ are set to 0.8 and 0.9 in our experiments. In addition, any operation on sentiment words (e.g., “great”) or negation words (e.g., “not”) is forbidden in the IMDB experiments. SentiWordNet (esuli2006sentiwordnet; baccianella2010sentiwordnet) is used to identify the sentiment words.
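The settings above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: reading the subscripts r, i, d as replace, insert, and delete is inferred from the notation, and the function names `sample_operation` and `passes_constraints` are hypothetical. A proposal that passes the check would presumably still go through the sampler's usual acceptance step, which is not shown here.

```python
import random

# Proposal probabilities from the text. Interpreting r/i/d as
# replace/insert/delete is an assumption based on the subscripts.
OPERATION_PROBS = {"replace": 0.5, "insert": 0.25, "delete": 0.25}

T_LM = 0.8  # threshold on the language-model score LM(x)
T_C = 0.9   # threshold on the target-class probability C(y~|x)


def sample_operation(rng):
    """Draw one operation type according to OPERATION_PROBS."""
    ops = list(OPERATION_PROBS)
    weights = [OPERATION_PROBS[op] for op in ops]
    return rng.choices(ops, weights=weights, k=1)[0]


def passes_constraints(lm_old, lm_new, c_old, c_new, t_lm=T_LM, t_c=T_C):
    """Return False when the proposal x' should be rejected directly.

    lm_old = LM(x), lm_new = LM(x'),
    c_old = C(y~|x), c_new = C(y~|x').
    """
    if lm_new < t_lm * lm_old:
        return False  # fluency dropped too sharply
    if c_new < t_c * c_old:
        return False  # target probability dropped too sharply
    return True


if __name__ == "__main__":
    rng = random.Random(0)
    print(sample_operation(rng))
    print(passes_constraints(lm_old=1.0, lm_new=0.85, c_old=0.5, c_new=0.46))
```

Because both thresholds are relative to the current sentence, the check tolerates a slow drift in either score but blocks any single step that loses more than 20% of the LM score or 10% of the target probability.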

The language models in MHA include a forward and a backward 2-layer LSTM, each with 300 units, trained on a subset of the One-Billion-Word Corpus (chelba2013one). We randomly select 5M sentences from the corpus for LM training. The vocabulary size is 50,000. The two LSTMs employ independent embedding matrices with the same word2vec initialization.
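One way to draw a fixed-size random subset from a large corpus and cap the vocabulary is sketched below. The text specifies only the subset size (5M sentences) and the vocabulary size (50,000); the use of reservoir sampling, whitespace tokenization, frequency-based truncation, and an `<unk>` token are all assumptions made for illustration, and the helper names are hypothetical.

```python
import random
from collections import Counter


def reservoir_sample(sentences, k, seed=0):
    """Uniformly sample up to k items from an iterable in one pass."""
    rng = random.Random(seed)
    sample = []
    for i, sentence in enumerate(sentences):
        if i < k:
            sample.append(sentence)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = sentence
    return sample


def build_vocab(sentences, max_size, unk="<unk>"):
    """Map the most frequent tokens to ids; id 0 is reserved for unk.

    Whitespace tokenization and frequency-based truncation are
    assumptions; the text only gives the final vocabulary size.
    """
    counts = Counter(tok for s in sentences for tok in s.split())
    frequent = [w for w, _ in counts.most_common() if w != unk]
    words = [unk] + frequent[: max_size - 1]
    return {w: idx for idx, w in enumerate(words)}


if __name__ == "__main__":
    corpus = (f"sentence number {i}" for i in range(1000))
    subset = reservoir_sample(corpus, k=50, seed=1)
    vocab = build_vocab(subset, max_size=10)
    print(len(subset), len(vocab))
```

With the values from the text, the calls would be `reservoir_sample(corpus, k=5_000_000)` and `build_vocab(subset, max_size=50_000)`.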

Genetic baseline.

The hyper-parameter settings of the genetic model are the same as in the original paper (alzantot2018generating).

Victim models.

The LSTMs in the victim models have 128 units. The bi-LSTM for IMDB has a vocabulary size of 10,000, while the two LSTMs in the BiDAF model share the same vocabulary size of 35,000. The embedding matrices are pre-trained by word2vec. In addition, the embedding matrix of the bi-LSTM model for IMDB is fixed during training to avoid overfitting. All classifiers in our experiment reach 99% accuracy on the training set.

Appendix B Adversarial Examples

Some adversarial examples are listed in Table 1. The genetic replacement considers only the current word itself, regardless of its context, which results in a disfluent sentence, whereas MHA performs replacement under the guidance of the LM, so the sentence remains fluent.

Empirically, MHA is allowed to operate on all types of words, including prepositions, pronouns, and punctuation, where the changes are minor, whereas the genetic approach only replaces verbs, nouns, adjectives, and adverbs. An advantage of operating on prepositions, punctuation, and the like is that human beings usually do not pay much attention to them, so humans can hardly recognize the adversarial examples generated by these operations.
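The difference in which positions each method may edit can be illustrated with a small sketch. The part-of-speech tags below are hand-assigned in the Universal POS style purely for illustration, and `editable_positions` is a hypothetical helper, not part of either method's implementation.

```python
# Word classes the genetic approach replaces, per the text:
# verbs, nouns, adjectives, and adverbs.
CONTENT_TAGS = {"VERB", "NOUN", "ADJ", "ADV"}


def editable_positions(tagged_tokens, method):
    """Return the token indices a method may operate on.

    tagged_tokens is a list of (token, pos_tag) pairs.
    """
    if method == "mha":
        # All word types, including function words and punctuation.
        return list(range(len(tagged_tokens)))
    if method == "genetic":
        return [i for i, (_, tag) in enumerate(tagged_tokens)
                if tag in CONTENT_TAGS]
    raise ValueError(f"unknown method: {method}")


if __name__ == "__main__":
    # Hand-tagged hypothesis from Table 1 (tags are an assumption).
    sentence = [
        ("the", "DET"), ("trash", "NOUN"), ("cans", "NOUN"),
        ("are", "AUX"), ("sitting", "VERB"), ("on", "ADP"),
        ("a", "DET"), ("beach", "NOUN"), (".", "PUNCT"),
    ]
    print(editable_positions(sentence, "genetic"))
    print(editable_positions(sentence, "mha"))
```

Under this tagging, the genetic approach could only touch "trash", "cans", "sitting", and "beach", while MHA could also edit "the", "on", or the final period, which is where the b-MHA change from "on a beach" to "in a beach" in Table 1 falls.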