INDEX
Explanations
names of people or characters
references to specific individuals and a theme of apathy
Negative Logits
Token         Logit
ieri          -0.72
amins         -0.71
skelet        -0.70
nikov         -0.69
Assembly      -0.65
ARM           -0.65
oter          -0.65
Citiz         -0.64
assic         -0.63
synagogue     -0.63
Positive Logits
Token         Logit
ysis          0.79
hammad        0.76
actionGroup   0.75
Pg            0.72
fw            0.69
vt            0.69
milo          0.66
ei            0.66
fp            0.66
cules         0.65
Activations Density: 0.031%