INDEX
Explanations
phrases indicating urgency or the need for action, particularly in a context of safety or legal matters
Negative Logits
asar     -0.20
rego     -0.19
aji      -0.15
ij       -0.15
ceased   -0.15
affen    -0.14
gree     -0.14
åģ       -0.14
eden     -0.14
.node    -0.14
Positive Logits
Inf              0.15
TEMPLATE         0.15
inf              0.15
.RunWith         0.15
Unc              0.15
ãĤ¯ãĤ·ãĥ§ãĥ³     0.15
zilla            0.15
indiscrim        0.14
Flag             0.14
\/\/             0.14
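The negative and positive lists above are the two ends of a ranking of tokens by logit value. A minimal sketch of that ranking follows, assuming the per-token logit values are already available as a plain mapping; the function name `top_logits` and the input dictionary are hypothetical and not part of the dashboard.

```python
def top_logits(logit_effects, k=10):
    """Return the k most negative and k most positive (token, value) pairs.

    `logit_effects` is a hypothetical dict mapping token string -> logit value.
    """
    # Sort ascending by value: most negative first, most positive last.
    ranked = sorted(logit_effects.items(), key=lambda kv: kv[1])
    negative = ranked[:k]
    # Take the tail and reverse it so the largest value comes first.
    positive = ranked[-k:][::-1]
    return negative, positive


# A few values taken from the lists above, for illustration only.
sample = {"asar": -0.20, "rego": -0.19, "Inf": 0.15, "Flag": 0.14}
negative, positive = top_logits(sample, k=2)
print(negative)
print(positive)
```

With only four tokens and `k=2`, the two lists do not overlap; on a full vocabulary `k` would typically be 10, as in the dashboard.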
Activations Density 0.003%
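Activation density is commonly read as the fraction of token positions on which the feature fires. The sketch below illustrates that reading under stated assumptions: the function name `activation_density`, the zero threshold, and the sample data are all hypothetical, and the dashboard may compute the figure differently.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    `activations` is a hypothetical list of per-token activation values.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Illustrative data: 3 active positions out of 100,000 tokens.
acts = [0.0] * 99_997 + [1.2, 0.4, 2.0]
density = activation_density(acts)
print(f"{density:.3%}")
```

A density this low means the feature is silent on almost every token, which is consistent with a narrow, specific explanation such as the one given above.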