INDEX
Explanations
emotionally charged verbs related to surprise or shock
words and expressions indicating strong emotional reactions
Negative Logits
ascript       -0.72
slips         -0.71
itions        -0.66
indexed       -0.62
aneous        -0.61
pointer       -0.59
polymorph     -0.58
organizing    -0.57
ーン          -0.56
arsen         -0.56
Positive Logits
us            0.90
me            0.85
him           0.85
everybody     0.78
them          0.75
everyone      0.74
anybody       0.70
onlook        0.69
Beir          0.69
HOU           0.66
Activations Density 0.162%