INDEX
Explanations
terms related to results and their validation in scientific research
Negative Logits
et      -0.58
↵↵      -0.55
S       -0.53
I       -0.53
to      -0.53
is      -0.52
a       -0.51
the     -0.51
ver     -0.51
,       -0.50
Positive Logits
myſelf      1.26
itſelf      1.17
Jefus       1.16
purpoſe     1.09
་་          1.08
$_"         1.07
―――――       1.07
ſeveral     1.07
Monfieur    1.07
Theſe       1.04
Activations Density: 0.425%