INDEX
Explanations
words related to proper nouns, specifically names or titles that contain 'ald'
references to specific individuals' last names
Negative Logits
PER      -0.87
PLAY     -0.74
puter    -0.72
zzo      -0.72
senal    -0.71
PLA      -0.70
ORTS     -0.69
prime    -0.68
PROV     -0.67
kick     -0.67
Positive Logits
sburg     1.09
orf       0.98
rums      0.95
ry        0.90
ivia      0.88
emort     0.88
oran      0.86
sson      0.85
ald       0.85
trump     0.85
Activations Density 0.015%