Explanations
references to numerical values or calculations
Negative Logits
Token         Logit
Roskov        -1.49
myſelf        -1.44
Efq           -1.40
ſelf          -1.33
itſelf        -1.33
LookAnd       -1.26
―――――         -1.26
raiſ          -1.24
Jefus         -1.23
themſelves    -1.21
Positive Logits
Token         Logit
<eos>         0.70
↵↵            0.68
[blank]       0.66
.             0.62
(             0.61
I             0.60
<strong>      0.60
x             0.60
.             0.59
as            0.58
Activations Density 0.314%