INDEX
Explanations
phrases related to policy and politics
patterns of disagreement or contention in discussions
Negative Logits
abandon     -0.71
boro        -0.68
herself     -0.63
comple      -0.63
swall       -0.62
iki         -0.61
lifes       -0.61
odon        -0.60
fulness     -0.59
asleep      -0.59
Positive Logits
ccording    0.90
Until       0.90
Scroll      0.89
Unless      0.87
Ur          0.85
Unlike      0.84
Even        0.84
Indeed      0.84
••          0.83
Asked       0.82
Activations Density 0.593%