INDEX
Explanations
phrases related to authority figures or organizations
references to government agencies or affiliations
Negative Logits
WHERE           -0.76
dos             -0.71
ahu             -0.69
achu            -0.68
vans            -0.67
chedel          -0.66
aza             -0.65
iphany          -0.65
anish           -0.64
soDeliveryDate  -0.64
Positive Logits
base            0.76
depend          0.70
indebted        0.70
dearly          0.67
depends         0.67
sympath         0.67
unres           0.66
Emb             0.65
suff            0.65
sole            0.64
Activations Density 0.114%