INDEX
Explanations
phrases related to tasks or instructions
phrases and words that emphasize personal agency or self-directed actions
Negative Logits
Token        Logit
inguished    -0.77
Applications -0.73
icularly     -0.73
ertility     -0.71
Ĥª           -0.70
Enhanced     -0.68
perties      -0.67
edIn         -0.65
letal        -0.65
ustomed      -0.64
Positive Logits
Token        Logit
clause       1.03
button       1.02
factor       0.86
"-           0.83
drawer       0.80
'-           0.78
clauses      0.77
bandwagon    0.77
option       0.76
-            0.76
Activations Density: 0.177%