INDEX
Explanations
statements indicating existence or presence
Negative Logits
infatti  -0.70
a        -0.68
an       -0.66
olyan    -0.62
It       -0.58
is       -0.56
[        -0.54
The      -0.54
as       -0.53
một      -0.52
Positive Logits
myſelf   1.03
itſelf   1.01
leaſt    1.01
Efq      0.98
―――――    0.96
Houſe    0.93
whoſe    0.93
uſed     0.92
Reſ      0.92
raiſ     0.91
Activations Density 0.511%