INDEX
Explanations
quotes or reported statements
Negative Logits
Himself    -0.97
artist     -0.70
amic       -0.68
OTUS       -0.67
Batman     -0.66
Insert     -0.64
ammy       -0.63
himself    -0.63
uper       -0.61
thia       -0.61
Positive Logits
they       0.84
alike      0.73
it         0.62
theirs     0.61
goodbye    0.61
there      0.60
selves     0.60
privately  0.60
Rising     0.60
delays     0.59
Activations Density: 0.133%
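
An activations density figure like this is usually read as the share of token positions on which the feature fires at all. The sketch below works through that reading as an assumption, not as the dashboard's documented method: the function name `activation_density`, the zero threshold, and the sample values are all hypothetical, and the sample is not taken from this feature.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means (active positions) / (all positions).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 3 of 2000 token positions.
sample = [0.0] * 1997 + [0.4, 1.2, 0.7]
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.150%"
```

Under this reading, a density of 0.133% would mean the feature is active on roughly 1 token in 750, so it is a sparse feature.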