INDEX
Explanations
phrases indicating transformation or modification
Negative Logits
peper      -0.54
cattle     -0.49
Zusammen   -0.48
肝         -0.47
Coul       -0.47
endian     -0.46
setof      -0.46
ого        -0.45
Kenne      -0.45
"}")       -0.45
Positive Logits
become         0.88
deviennent     0.85
Become         0.85
diventare      0.85
值为           0.82
AnchorStyles   0.80
becomes        0.79
becoming       0.78
Become         0.78
Geplaatst      0.75
Activations Density 0.410%