Explanations
phrases indicating the beginning or transformation of movements and behaviors
Negative Logits
abestanden        -0.51
насељу            -0.48
eyel              -0.45
HXLINE            -0.44
kife              -0.43
chande            -0.43
請繼續往下閱讀    -0.43
qtype             -0.43
separ             -0.41
Ҭ                 -0.41
Positive Logits
featureID         0.49
numerusform       0.47
PeEnEo            0.38
ursprünglich      0.38
превра            0.37
pinulongan        0.36
schein            0.36
endorong          0.35
seemingly         0.35
trajets           0.35
Activations Density: 0.734%