Explanations
clauses that express opinions or concerns related to political issues
Negative Logits
Token   Logit
she     -1.26
his     -1.20
this    -1.20
her     -1.20
our     -1.17
then    -1.09
here    -1.07
now     -1.07
these   -1.06
their   -1.04
Positive Logits
Token     Logit
The       1.12
This      1.08
There     1.05
It        1.05
They      0.99
These     0.97
If        0.95
That      0.94
However   0.90
When      0.89
Activations Density 2.476%