Explanations
words related to negative outcomes or predictions
concepts related to inevitability and failure
Negative Logits
Token       Logit
EMA         -0.73
tein        -0.68
POL         -0.68
enne        -0.67
lean        -0.65
assisted    -0.63
CRE         -0.62
soType      -0.62
amera       -0.62
aque        -0.62
Positive Logits
Token       Logit
doomed      0.92
doom        0.87
releg       0.83
downfall    0.81
mete        0.78
fate        0.76
iflower     0.75
relegation  0.73
itably      0.72
itable      0.71
Activations Density: 0.019%