Explanations
the presence of the letter "A" at the beginning of phrases or sentences
Negative Logits
encre           -0.88
principalTable  -0.81
etheless        -0.80
useAppContext   -0.80
ukone           -0.76
Aiheesta        -0.76
فريبيس          -0.75
ftagPool        -0.75
estime          -0.74
طلحات           -0.74
Positive Logits
A     1.14
A     0.95
А     0.67
getA  0.67
E     0.67
U     0.66
G     0.66
B     0.65
M     0.63
F     0.63
Activations Density: 0.208%