INDEX
Explanations
References to a person named Ali, across variations of the name and different contexts.
Mentions of the name "Ali."
Negative Logits
ablishment   -0.83
lace         -0.79
sburgh       -0.79
sylvania     -0.77
eenth        -0.76
lopp         -0.74
umbn         -0.73
mble         -0.73
chie         -0.72
namese       -0.72
Positive Logits
Ali          0.95
Express      0.89
Jinn         0.88
orescence    0.88
ases         0.86
osa          0.85
Ali          0.84
uth          0.83
otti         0.80
onda         0.79
Activations Density: 0.004%