Explanations
words related to the improvement or deterioration of situations
phrases indicating worsening or improving conditions
Negative Logits
heid            -0.74
ellow           -0.73
entirety        -0.70
WB              -0.65
rans            -0.64
onement         -0.62
apple           -0.62
inance          -0.61
zhen            -0.60
vell            -0.60
Positive Logits
progressively    0.87
veter            0.86
nearer           0.79
acquainted       0.77
noticed          0.73
sidx             0.73
traction         0.72
gradually        0.71
creeps           0.69
accustomed       0.69
Activations Density: 0.105%