Explanations
words related to contrast or alternative choices
expressions of alternatives or contrasts
Negative Logits
Token       Logit
eele        -0.77
owe         -0.72
âĹ¼         -0.69
eno         -0.62
CLUD        -0.62
Influ       -0.62
çͰ          -0.61
acha        -0.61
quad        -0.61
oker        -0.60
Positive Logits
Token       Logit
blindly     1.02
altogether  0.98
passively   0.94
outright    0.87
anymore     0.86
solely      0.83
ourselves   0.81
entirely    0.81
mindless    0.77
oneself     0.76
Activations Density 0.256%