INDEX
Explanations
phrases related to political ideologies and technological advancements
Negative Logits
Token        Logit
atorium      -0.79
same         -0.65
iversary     -0.63
versely      -0.63
REDACTED     -0.62
oux          -0.62
fingerprint  -0.61
ificantly    -0.59
aeper        -0.58
liest        -0.58
Positive Logits
Token        Logit
hips         1.15
paces        1.13
mith         1.09
manship      1.06
abound       1.05
uits         1.05
poons        1.05
hip          1.00
pace         0.96
hooting      0.94
Activations Density: 0.500%