INDEX
Explanations
German words and names
Negative Logits
eers        -0.87
states      -0.72
ered        -0.71
quo         -0.70
eer         -0.68
Hots        -0.66
manship     -0.65
SHIP        -0.64
rers        -0.63
trumpet     -0.60
Positive Logits
udge        1.22
acker       1.18
atton       1.16
anded       1.14
umpy        1.12
acket       1.12
acking      1.10
abbit       1.10
aternity    1.08
agnar       1.07
Activations Density: 1.712%