INDEX
Explanations
mentions of the name "Ali" with varying levels of relevance
mentions of the name "Ali."
Negative Logits
eenth      -0.86
lace       -0.84
ledged     -0.83
sburgh     -0.81
namese     -0.80
neys       -0.76
sylvania   -0.76
lying      -0.75
ledge      -0.74
darn       -0.73
Positive Logits
Express      0.98
osa          0.95
ases         0.93
Jinn         0.90
orescence    0.89
uth          0.81
ased         0.81
otti         0.80
Kham         0.79
Ali          0.78
Activations Density 0.021%