INDEX
Explanations
directives to people or commands
NEGATIVE LOGITS
token                             logit
////////////////////////////////  -0.64
abiding                           -0.61
traumatic                         -0.60
Kop                               -0.60
laughter                          -0.59
quickShipAvailable                -0.59
genesis                           -0.58
paren                             -0.58
Got                               -0.58
matched                           -0.57
POSITIVE LOGITS
token        logit
participate  1.29
perform      1.21
submit       1.21
undertake    1.21
behave       1.21
join         1.19
adhere       1.17
undergo      1.15
donate       1.15
obey         1.14
Activations Density 0.846%