INDEX
Explanations
words related to reduction or decrease
terms related to reduction or decrease
Negative Logits
Token              Logit
ullivan            -0.77
rote               -0.70
otle               -0.69
=-=-=-=-=-=-=-=-   -0.68
OH                 -0.68
alez               -0.68
oho                -0.67
Web                -0.66
PB                 -0.66
Offline            -0.65
Positive Logits
Token              Logit
dimin              1.10
diminish           0.94
lessen             0.91
diminished         0.80
proport            0.80
inished            0.75
utive              0.74
Ń·                 0.74
aback              0.74
wcs                0.71
Activations Density: 0.009%