Explanations
technical terms related to health and medicine
Negative Logits
myſelf        -0.94
Monfieur      -0.89
iſt           -0.85
Efq           -0.84
་་            -0.82
Eſ            -0.81
ſeveral       -0.80
Jefus         -0.80
Majefty       -0.79
BibitemShut   -0.79
Positive Logits
<bos>          0.74
               0.56
nd             0.55
ver            0.53
ug             0.53
ud             0.53
red            0.52
inter          0.50
ke             0.50
足             0.49
Activations Density 0.414%