INDEX
Explanations
specific terms related to medical conditions and technical jargon
Negative Logits
m       -0.87
ness    -0.75
s       -0.71
ly      -0.70
ms      -0.69
baix    -0.68
r       -0.68
ting    -0.66
ling    -0.65
us      -0.65
Positive Logits
auso     1.04
purpoſe  0.99
Dooley   0.99
Quo      0.97
eo       0.96
Malo     0.96
Rollo    0.94
Mojo     0.94
Ando     0.93
Puro     0.93
Activations Density 0.844%