INDEX
Explanations
threatening statements or commands
urgent requests or pleas for help
Negative Logits
agonists      -0.66
pmwiki        -0.61
strangely     -0.59
Grimoire      -0.58
odcast        -0.58
Canaver       -0.57
Hearthstone   -0.56
prisingly     -0.56
Hels          -0.56
oddly         -0.56
Positive Logits
sic           1.10
'"            1.10
…"            1.06
!'"           0.99
..."          0.99
.'"           0.96
}"            0.88
'."           0.86
!"            0.85
,'"           0.84
Activations Density 0.552%