INDEX
NEGATIVE LOGITS
Token       Logit
as          0.87
alz         0.86
In          0.83
tuttavia    0.83
however     0.81
oral        0.81
alc         0.80
R           0.80
l           0.79
I           0.79
POSITIVE LOGITS
Token       Logit
′-          1.09
-           1.07
-${         1.04
־           1.01
-【          0.99
_           0.96
‐           0.96
-$\         0.93
ти          0.92
-;          0.90
Activations Density 0.003%