INDEX
Negative Logits
unconditional    -0.09
succeeded        -0.08
Regardless       -0.08
interrupted      -0.08
sequel           -0.07
hypert           -0.07
unir             -0.07
Whatever         -0.07
nummer           -0.07
imis             -0.07
Positive Logits
CHOOL            0.08
.scale           0.08
TED              0.08
осві             0.08
führer           0.08
tum              0.08
Schools          0.07
scale            0.07
adjective        0.07
chool            0.07
Activations Density: 0.001%