INDEX
Explanations
the word "In" alone on a line, which is a quirk of the data format
Negative Logits
myſelf      -1.97
itſelf      -1.96
Efq         -1.94
Monfieur    -1.92
Jefus       -1.80
Theſe       -1.77
pleaſure    -1.74
becauſe     -1.73
Inſ         -1.70
purpoſe     -1.68
Positive Logits
in          2.52
on          1.19
dalam       1.08
at          1.05
,           1.02
is          0.97
for         0.96
в           0.94
to          0.93
as          0.91
Activations Density: 1.564%