INDEX
Explanations
phrases indicating inclusion or mention of multiple entities or facts
phrases indicative of membership or association in a group or category
Negative Logits
([          -0.74
Period      -0.74
bane        -0.74
ynthesis    -0.70
hiro        -0.68
ensis       -0.67
cycle       -0.65
arden       -0.65
mop         -0.65
ategory     -0.65
Positive Logits
casualties      1.03
targets         1.02
beneficiaries   1.01
findings        0.99
recipients      0.98
surprises       0.95
topics          0.94
dozen           0.89
plaintiffs      0.89
revelations     0.89
Activations Density 0.182%