INDEX
Explanations
commands or instructions emphasizing prohibition or caution
negative commands or prohibitions
Negative Logits
ELD          -0.82
liner        -0.76
Redd         -0.75
grounds      -0.68
ilage        -0.67
Redditor     -0.66
upon         -0.65
orld         -0.65
testament    -0.62
turned       -0.61
Positive Logits
icably           1.00
icable           0.96
exceed           0.76
theless          0.76
underestimate    0.76
confuse          0.76
hesitate         0.74
interrupt        0.74
worry            0.74
necessarily      0.72
Activations Density: 0.126%
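
The page does not define the density figure. Assuming it reports the fraction of sampled token positions on which the feature activates above zero, a minimal sketch of that calculation might look like the following. The function name, the threshold parameter, and the sample data are all hypothetical and are not taken from the source.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature is active. The source page does not state this definition.
    """
    values = list(activations)
    if not values:
        return 0.0
    active = sum(1 for a in values if a > threshold)
    return active / len(values)


# Hypothetical data: one active position out of 800 sampled positions.
sample = [0.0] * 799 + [2.5]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.126% would mean the feature is active on roughly one token position in every 800.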