INDEX
Explanations
the verb "blame" followed by a target for attribution
key phrases and statements indicating strong opinions or significant events
Negative Logits
orsi             -0.70
demonstrating    -0.64
thanking         -0.64
referring        -0.63
Citation         -0.63
telling          -0.63
(£               -0.61
/$               -0.60
reprint          -0.60
disclosing       -0.60
Positive Logits
endas            0.74
ictionary        0.73
thood            0.71
border           0.70
iped             0.70
DIT              0.68
aughs            0.67
ãĥīãĥ©           0.65
hetics           0.64
aband            0.64
Activations Density: 1.656%