INDEX
Explanations
phrases indicating calls to action or suggestions for specific actions
occurrences of the word "call" in various forms, indicating demands or requests in the text
Negative Logits
riks            -0.62
ILCS            -0.61
kus             -0.59
ãĥĹ             -0.57
mental          -0.56
ãĥ¼ãĥĨ          -0.56
女              -0.56
Ö¼              -0.55
é¾įå            -0.55
âĢ¢âĢ¢âĢ¢âĢ¢    -0.55
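Several of the tokens listed above (for example ãĥĹ and âĢ¢âĢ¢âĢ¢âĢ¢) look like raw vocabulary strings from a byte-level BPE tokenizer, where each byte of the UTF-8 text is shown as a stand-in printable character. The source does not say which tokenizer produced them, so the sketch below assumes the GPT-2 style byte-to-character table; the function names are hypothetical and chosen only for illustration. Under that assumption it maps such a string back to readable text.

```python
def bytes_to_unicode():
    """Byte -> printable character table, as in GPT-2 style byte-level BPE.

    Assumption: the dashboard's tokenizer uses this table; the source
    does not confirm it.
    """
    # Bytes that are already printable keep their own code point.
    printable = (
        list(range(0x21, 0x7F))     # '!' .. '~'
        + list(range(0xA1, 0xAD))   # '¡' .. '¬'
        + list(range(0xAE, 0x100))  # '®' .. 'ÿ'
    )
    table = {b: chr(b) for b in printable}
    # Every remaining byte is shifted to an unused code point from 256 upward.
    shift = 0
    for b in range(256):
        if b not in table:
            table[b] = chr(256 + shift)
            shift += 1
    return table


# Inverse table: stand-in character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_byte_level_token(token):
    """Return readable text for a byte-level token string (hypothetical helper).

    Tokens containing characters outside the table are returned unchanged,
    since they are presumably already decoded. Incomplete UTF-8 sequences
    (a token can end mid-character) become U+FFFD replacement characters.
    """
    if any(ch not in CHAR_TO_BYTE for ch in token):
        return token
    raw = bytes(CHAR_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")
```

With this table, decode_byte_level_token("âĢ¢âĢ¢âĢ¢âĢ¢") gives "••••" and decode_byte_level_token("ãĥĹ") gives "プ", while plain ASCII tokens such as "mental" come back unchanged.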
Positive Logits
attention       1.20
for             1.00
forth           0.92
upon            0.92
ously           0.91
enance          0.87
phas            0.81
into            0.81
Attention       0.80
bullshit        0.75
Activations Density 0.054%