INDEX
Explanations
verbs related to making requests or giving instructions
actions that involve requests or instructions
Negative Logits
requires      -0.79
Appears       -0.71
Prelude       -0.66
abiding       -0.64
productive    -0.64
lines         -0.64
tions         -0.63
etheless      -0.63
pite          -0.61
ults          -0.60
Positive Logits
participate   1.07
remove        0.97
reconsider    0.97
postpone      0.94
partake       0.92
give          0.92
keep          0.91
join          0.91
preserve      0.89
undertake     0.89
Activations Density: 0.088%