Explanations
The word "consider."
Directives or prompts to reflect on actions or decisions.
Negative Logits
Token    Logit
Els      -0.76
phys     -0.74
aqu      -0.71
vous     -0.70
zona     -0.70
oiler    -0.69
inos     -0.67
WARN     -0.67
PATH     -0.66
IER      -0.64
Positive Logits
Token          Logit
whether        1.10
donating       1.03
abandoning     0.94
oneself        0.90
aloud          0.89
suing          0.88
joining        0.88
alternatives   0.87
withdrawing    0.86
yourselves     0.86
Activations Density: 0.044%
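
The page does not define the density figure. A minimal sketch of one common reading, assuming density means the fraction of tokens on which the feature's activation is above zero, is given here. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and chosen only to illustrate the arithmetic; they are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of tokens on which the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 44 of 100,000 tokens.
sample = [0.0] * 99_956 + [1.5] * 44
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.044%
```

Under this reading, a density of 0.044% would mean the feature is sparse, firing on roughly 4 or 5 tokens in every 10,000.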