INDEX
Explanations
names of individuals of interest
Negative Logits
agements  -0.74
aries     -0.73
agers     -0.73
ostic     -0.72
ishes     -0.67
rd        -0.66
amins     -0.65
ests      -0.65
ish       -0.65
aging     -0.64
Positive Logits
llular    1.10
idon      0.95
llor      0.95
lyn       0.90
llan      0.87
lla       0.85
ptives    0.80
vich      0.80
lli       0.79
flight    0.76
Activations Density: 0.045%