Explanations
words related to decrease or reduced
phrases related to decreases in various metrics or conditions
Negative Logits
ervatives   -0.68
ansas       -0.68
dor         -0.66
wered       -0.65
Found       -0.65
Bio         -0.65
RA          -0.64
Reviewed    -0.64
ervative    -0.64
rs          -0.63
Positive Logits
cember      0.78
regress     0.73
proport     0.73
friction    0.71
imar        0.70
inhib       0.69
ately       0.69
rences      0.68
CTR         0.68
sidx        0.68
Activations Density 0.031%