INDEX
Explanations
gender, languages, policies
Negative Logits
kiyor  0.43
قاف  0.41
achievement  0.40
یثیت  0.40
MVCProject  0.40
béco  0.39
genealogical  0.39
وحتى  0.39
حدث  0.38
']))->  0.38
Positive Logits
rg  0.41
ir  0.40
RR  0.39
ri  0.39
RY  0.38
RG  0.38
=  0.37
ilación  0.36
ring  0.36
Figures  0.36
Activations Density 0.090%