INDEX
Explanations
requests or instructions involving taking action
conjunctions and connectors that create a sense of continuation or linking between ideas
Negative Logits
arious       -0.69
Originally   -0.68
][           -0.66
Iraq         -0.66
Originally   -0.65
ļéĨĴ         -0.65
responsible  -0.64
anders       -0.63
ensive       -0.63
agen         -0.63
Positive Logits
reap         1.15
enjoy        1.14
then         1.08
preferably   1.06
THEN         1.06
bask         1.01
beware       1.01
proceed      1.01
apply        1.00
vo           1.00
Activations Density 0.258%