INDEX
Explanations
pronouns (he, she, they) followed by verbs indicating speech or action
statements or claims made by individuals
Negative Logits
acters      -0.69
eating      -0.66
wcsstore    -0.64
rocket      -0.63
ancial      -0.63
rame        -0.62
usha        -0.61
uitous      -0.60
Benefit     -0.60
alian       -0.59
Positive Logits
argued      0.82
'd          0.82
reasoned    0.81
said        0.80
says        0.78
say         0.78
argues      0.76
©¶æ¥µ       0.75
argue       0.73
pointed     0.67
Activations Density 0.237%
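
A density figure like the one above is usually read as the fraction of token positions on which the feature fires at all. The sketch below illustrates that reading only. The function name, the zero threshold, and the sample activations are assumptions made for illustration, not the dashboard's actual method or data.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds threshold.

    Assumption: "density" means the share of tokens with a non-zero
    activation. The source page does not spell out its definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical activations over 8 token positions; 2 of them are non-zero.
sample = [0.0, 0.0, 1.3, 0.0, 0.0, 0.0, 0.4, 0.0]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, 0.237% would mean the feature is active on roughly 2 to 3 tokens out of every 1,000.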