INDEX
Explanations
negative critiques regarding performances in films and other media
Negative Logits
reasonably  -0.17
gle         -0.15
824         -0.15
.dex        -0.15
uled        -0.15
heet        -0.14
Bere        -0.14
uring       -0.13
082         -0.13
عÙĪ         -0.13
Positive Logits
beyond      0.16
every       0.15
indeed      0.15
insula      0.14
zyst        0.14
inus        0.14
çĶļèĩ³      0.14
acom        0.14
remen       0.14
.MSG        0.14
Activations Density: 0.348%