INDEX
Explanations
descriptions of physical actions or commands given to others
actions related to threats and violence
Negative Logits
pmwiki      -0.83
egreg       -0.78
[/          -0.73
anecd       -0.72
notably     -0.70
rhet        -0.69
incent      -0.68
iosyncr     -0.68
ensibly     -0.68
quasi       -0.67
Positive Logits
someday     0.86
hers        0.82
supper      0.80
him         0.80
daddy       0.76
Daddy       0.76
pills       0.73
his         0.71
Allah       0.71
downstairs  0.69
Activations Density 0.979%