Explanations
personal pronouns and verbs indicating agency or existence
Negative Logits
Token              Logit
Gegenteil          -0.33
bocetos            -0.33
Wochenende         -0.32
AssemblyCulture    -0.32
defaultstate       -0.31
Einwilligung       -0.31
InteropServices    -0.30
Seguridad          -0.30
pungkasnya         -0.30
Saluti             -0.30
Positive Logits
Token              Logit
zuſammen           0.80
zwiſchen           0.77
<unused79>         0.76
<unused28>         0.76
<unused8>          0.76
[@BOS@]            0.76
<unused14>         0.75
<unused23>         0.75
<unused3>          0.75
<pad>              0.75
Activations Density: 0.072%