Explanations
statements from different people
statements indicating a refusal or denial
Negative Logits
Token       Logit
Outside     -0.54
Moroc       -0.52
Coliseum    -0.51
ichever     -0.51
School      -0.51
romeda      -0.50
Pearce      -0.50
eday        -0.50
Pixie       -0.49
Gray        -0.49
Positive Logits
Token           Logit
hereby          0.77
largeDownload   0.74
nevertheless    0.71
pmwiki          0.71
nonetheless     0.65
']              0.62
surely          0.62
adv             0.61
replied         0.58
');             0.58
Activations Density: 0.537%