Negative Logits
Token     Logit
being     -0.91
(         -0.65
,         -0.61
being     -0.60
.         -0.54
BEING     -0.53
and       -0.52
is        -0.52
Being     -0.49
Being     -0.48
Positive Logits
Token     Logit
raiſ      1.51
Houſe     1.49
uſed      1.48
houſe     1.45
myſelf    1.42
Diſ       1.41
itſelf    1.41
ſtate     1.39
Monfieur  1.39
ſever     1.38
Activations Density: 0.027%