INDEX
Explanations
pronouns for people
references to individuals or entities being discussed
Negative Logits
Token       Logit
pad         -0.72
termin      -0.71
Loading     -0.71
adder       -0.70
âĨ          -0.70
roads       -0.70
atory       -0.66
issue       -0.64
reach       -0.63
rid         -0.63
Positive Logits
Token       Logit
soever      1.97
awei        0.85
selves      0.81
izens       0.78
osponsors   0.77
irlf        0.75
oun         0.73
owship      0.73
coh         0.73
aji         0.71
Activations Density: 0.019%