Explanations
actions or verbs related to change and improvement
Negative Logits
isma      -0.17
muz       -0.16
MAP       -0.15
ebe       -0.14
nett      -0.14
rum       -0.14
gis       -0.14
aña       -0.14
rowse     -0.13
ства      -0.13
Positive Logits
elay           0.15
adies          0.14
abel           0.14
inth           0.14
intentional    0.14
ละ             0.14
odia           0.14
ĮĢ             0.14
eterminate     0.13
addCriterion   0.13
Activations Density 0.041%