Explanations
numerical data and references to studies or research findings
Negative Logits
ÅĪ          -0.18
].[         -0.17
uesto       -0.15
à¹Ĥà¸Ļ      -0.15
orra        -0.15
arto        -0.15
ansson      -0.15
ç§          -0.14
/tos        -0.14
ãģĨãģ¡      -0.14
Positive Logits
amel             0.15
yd               0.15
ãĤ¿ãĥ«           0.14
_authentication  0.13
صØŃ             0.13
cope             0.13
PLICIT           0.13
éĥŃ              0.13
ystate           0.13
agal             0.13
Activations Density 0.039%