Explanations
phrases related to invariance under transformations
Negative Logits
"]);          -0.90
').           -0.85
########.     -0.80
مشين          -0.79
estekak       -0.77
"],           -0.77
"])           -0.75
تضيفلها       -0.75
."),          -0.73
"]).          -0.73
Positive Logits
membres       0.57
kepolisian    0.55
demandes      0.54
eorum         0.54
supérieures   0.54
immédi        0.53
médecins      0.53
faptul        0.52
defn          0.52
chimiques     0.51
Activations Density 0.003%