Negative Logits
Token       Logit
arch        -0.78
repro       -0.73
adjud       -0.73
prelim      -0.71
separated   -0.71
scheduled   -0.70
favor       -0.70
listed      -0.69
powered     -0.69
departed    -0.69
Positive Logits
Token       Logit
We          1.57
Our         1.49
They        1.47
It          1.45
There       1.45
Everybody   1.43
Nobody      1.40
I           1.40
Everything  1.40
What        1.39
Activations Density 0.570%