INDEX
Explanations
Phrases that express demands, calls to action, or requests for changes in policy or regulation.
Negative Logits
RegExp          -0.17
hausen          -0.15
CanBe           -0.15
uz              -0.14
appen           -0.14
isel            -0.14
wald            -0.14
rtl             -0.14
ramer           -0.13
InstanceState   -0.13
Positive Logits
immediate       0.29
action          0.29
.action         0.26
Immediate       0.23
greater         0.23
ACTION          0.23
action          0.22
-action         0.21
Action          0.21
calm            0.20
Activations Density: 0.088%