Explanations
words that emphasize literal situations or direct statements
Negative Logits
Token              Logit
autorytatywna      -0.73
舺                 -0.67
desmotivaciones    -0.65
niosek             -0.64
étoient            -0.62
rungsseite         -0.62
müſſen             -0.61
Infór              -0.57
ähteet             -0.57
seamnă             -0.57
Positive Logits
Token              Logit
it                 0.59
actual             0.55
One                0.53
the                0.48
The                0.48
World              0.47
Actual             0.47
Individual         0.47
It                 0.47
New                0.47
Activations Density: 0.538%