Explanations
phrases related to shocking or morally outrageous actions
expressions of shock or disbelief regarding human actions or decisions
Negative Logits
/             -0.72
taboola       -0.69
Newsletter    -0.64
respectively  -0.63
Ghosts        -0.60
Regist        -0.59
Areas         -0.58
Pigs          -0.58
Trog          -0.57
CrossRef      -0.57
Positive Logits
such          1.00
?!"           0.99
!?"           0.94
seemingly     0.89
blatantly     0.87
mere          0.87
suddenly      0.86
!?            0.81
?!            0.81
without       0.76
Activations Density 0.759%