INDEX
Explanations
phrases indicating personal limitations or experiences
Negative Logits
Datuak              -0.81
beginnetje          -0.74
transfieras         -0.74
kaarangay           -0.72
তথ্যসূত্র              -0.69
EndContext          -0.68
انيف                -0.66
AssemblyCulture     -0.66
Vidite              -0.65
✨:                  -0.65
Positive Logits
they                0.66
we                  0.51
thenReturn          0.48
нская               0.47
there               0.46
she                 0.45
trọng               0.45
│                   0.44
rary                0.43
افظ                 0.42
Activations Density 0.045%