INDEX
Explanations
medical studies or advice
archaic or old-fashioned terms
Negative Logits
-0.79
.
-0.78
,
-0.77
↵
-0.72
↵↵
-0.71
<eos>
-0.69
(
-0.69
-
-0.68
to
-0.66
upon
-0.63
POSITIVE LOGITS
Monfieur    1.33
myſelf      1.31
ainfi       1.30
auroit      1.30
purpoſe     1.28
feroit      1.27
avoient     1.25
Efq         1.23
auffi       1.23
Theſe       1.23
Activations Density 1.480%