INDEX
Explanations
phrases indicating worsening conditions or negative situations
Negative Logits
heny      -0.77
ogens     -0.75
DX        -0.73
ebin      -0.73
ingu      -0.71
ocene     -0.69
odcast    -0.68
ogenous   -0.68
otal      -0.67
tnc       -0.66
Positive Logits
farther   0.79
faster    0.78
hotter    0.76
clearer   0.76
heavier   0.75
prett     0.73
louder    0.69
thicker   0.69
harder    0.68
quicker   0.67
Activations Density: 0.031%