INDEX
Explanations
words related to foreign language or non-English terms
Negative Logits
juni          -0.46
AndEndTag     -0.45
Escolar       -0.43
ScopeManager  -0.42
Jungen        -0.41
httphttps     -0.41
alturas       -0.41
uxxxx         -0.41
uai           -0.40
NUKAT         -0.40
Positive Logits
ف     1.76
ف     1.19
الف   1.09
فأ    0.90
فه    0.77
פ     0.75
F     0.75
فال   0.75
والف  0.68
getF  0.67
Activations Density 0.001%