Explanations
sentences expressing personal opinions or critiques about films
Negative Logits
Compatible   -0.16
appe         -0.15
UNK          -0.15
åŀ           -0.15
unk          -0.15
andas        -0.14
woord        -0.14
ansi         -0.14
öh           -0.14
ittest       -0.14
Positive Logits
acher        0.17
.cx          0.17
-await       0.15
zem          0.14
chw          0.14
воÑĢ         0.14
synchron     0.14
uling        0.14
achat        0.14
LAT          0.14
Activations Density: 0.094%