INDEX
Explanations
proper nouns related to different individuals
the structure and presence of the letter 'p' in various contexts
Negative Logits
ledged     -0.81
jriwal     -0.74
ascus      -0.69
arest      -0.69
pasture    -0.66
clinch     -0.65
alore      -0.65
lished     -0.65
sterdam    -0.62
uilt       -0.62
Positive Logits
INAL       0.72
ãĤ·ãĥ£     0.67
kov        0.65
¯¯         0.64
Ń·         0.63
IAL        0.62
berger     0.60
èĢħ        0.60
Luther     0.60
Patton     0.59
Activations Density 0.076%