INDEX
Explanations
proper nouns or acronyms related to organizations or events
references to the National Cancer Institute or related organizations and terms
Negative Logits
thing      -0.87
chest      -0.77
loo        -0.73
shake      -0.72
Magikarp   -0.71
mats       -0.69
tons       -0.68
mania      -0.67
Picture    -0.67
many       -0.67
Positive Logits
urses      1.07
ODE        0.94
ategor     0.89
urrent     0.88
NC         0.85
ouri       0.84
ossus      0.82
ADE        0.80
onduct     0.76
NC         0.76
Activations Density: 0.007%