Explanations
historical figures and their achievements
Negative Logits
نا	0.37
ش	0.35
ности	0.34
ब	0.33
ج	0.32
س	0.32
τ	0.30
وح	0.30
as	0.29
ब्र	0.29
Positive Logits
be	0.35
stesso	0.35
}	0.34
was	0.32
'	0.31
an	0.28
I	0.28
Jr	0.28
Seite	0.28
protégé	0.28
Activations Density: 0.203%