Explanations
First-person singular and plural pronouns appearing alongside action verbs and phrases that express criticism or expectations about social or political issues.
Negative Logits
Token         Logit
Emin          -0.57
moreover      -0.57
Sigma         -0.56
bp            -0.56
rame          -0.54
Ack           -0.54
Chain         -0.54
bench         -0.53
AAA           -0.53
olate         -0.52
Positive Logits
Token         Logit
erest         0.69
counterparts  0.67
zek           0.65
predecessors  0.63
resent        0.61
ayn           0.60
adesh         0.60
igr           0.60
thinkable     0.59
én            0.58
Activations Density: 0.132%