INDEX
Explanations
political statements or opinions
the specific character "â̦"
Negative Logits
favour       -0.87
\":          -0.73
sails        -0.69
ages         -0.68
intern       -0.68
iking        -0.65
untouched    -0.65
becoming     -0.65
pharmacy     -0.65
informing    -0.64
Positive Logits
However      0.93
ONSORED      0.89
Also         0.89
Therefore    0.83
They         0.80
cknowled     0.80
Different    0.80
There        0.79
See          0.79
Its          0.79
Activations Density: 0.104%