INDEX
Explanations
numbers and characters in a specific format
numerical values and terms related to health conditions or regulations
Negative Logits
nesota      -0.90
NetMessage  -0.88
isters      -0.80
icter       -0.80
perty       -0.79
mits        -0.79
ippi        -0.76
ysical      -0.75
oning       -0.75
istine      -0.72
Positive Logits
er          0.75
ness        0.74
ty          0.72
ative       0.72
ribution    0.71
オ          0.70
rition      0.70
atively     0.68
culosis     0.68
rage        0.68
Activations Density 0.043%