INDEX
Negative Logits
u        0.41
ad       0.41
of       0.39
to       0.37
werd     0.37
is       0.35
to       0.35
á        0.35
uft      0.35
um       0.34
POSITIVE LOGITS
Ы        0.29
N        0.28
Y        0.28
Ink      0.26
↵        0.25
Castile  0.25
.        0.23
-        0.23
U        0.23
Gotham   0.23
Activations Density 0.055%