INDEX
Explanations
phrases discussing actions or statements made by individuals
Negative Logits
abil                 -0.68
ander                -0.64
WIN                  -0.64
houses               -0.64
halla                -0.63
nurture              -0.63
hips                 -0.62
natureconservancy    -0.62
presided             -0.61
2019                 -0.60
Positive Logits
sure                 1.04
noises               0.96
clear                0.92
headlines            0.90
assertions           0.88
mention              0.83
dispar               0.83
insin                0.82
explicit             0.80
landfall             0.80
Activations Density 0.122%