INDEX
Explanations
phrases related to medical and biological conditions, particularly those describing processes or characteristics
Negative Logits
Token       Logit
Monfieur    -0.91
itſelf      -0.90
ſeveral     -0.90
Theſe       -0.89
myſelf      -0.84
whoſe       -0.82
Jefus       -0.81
་་          -0.80
Eſ          -0.80
Efq         -0.79
Positive Logits
Token       Logit
<bos>       0.80
.           0.54
/           0.50
,           0.50
4           0.45
3           0.45
<eos>       0.45
(           0.45
            0.44
↵↵          0.44
Activations Density 49.482%
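
A density figure like the one above is commonly read as the fraction of sampled tokens on which the feature fires at all. The sketch below illustrates that reading only; the function name `activation_density`, the zero threshold, and the toy activation values are assumptions for illustration, not taken from this page or from the tool that produced it.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of activation values strictly above `threshold`.

    Assumption: "density" means the share of tokens with a nonzero
    (here: above-threshold) feature activation.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical per-token activations for one feature (made-up numbers).
sample = [0.0, 1.2, 0.0, 0.3, 0.0, 0.0, 2.1, 0.0]
density = activation_density(sample)
print(f"Activations Density {density:.3%}")
```

Under this reading, a density near 50% would mean the feature is active on roughly every second token, which is unusually dense for a feature with a narrow explanation.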