INDEX
Explanations
general expressions of routine activities and social interactions
Negative Logits
Token      Logit
_sym       -0.16
brero      -0.16
yll        -0.15
aret       -0.14
instead    -0.14
Dong       -0.14
Symfony    -0.14
éĿĪ        -0.14
abbo       -0.13
lash       -0.13
Positive Logits
Token      Logit
ãģ¥        0.16
Hopkins    0.16
iores      0.15
iani       0.15
ocking     0.15
à¥įà¤Ĺत   0.15
(æ°´       0.14
FITNESS    0.14
[src       0.14
inen       0.14
Activations Density: 0.276%