INDEX
Explanations
names of individuals
proper nouns, specifically names of individuals
Negative Logits
ModLoader      -0.75
DonaldTrump    -0.59
retina         -0.59
Perception     -0.55
Jose           -0.52
advertisers    -0.52
grat           -0.52
muc            -0.50
ILA            -0.50
LEASE          -0.50
Positive Logits
ovich          0.76
sson           0.76
quez           0.76
monds          0.71
plin           0.71
ensen          0.70
uez            0.69
aka            0.68
kson           0.68
zinski         0.68
Activations Density: 0.304%