INDEX
Explanations
phrases related to political discussions and opinions
punctuation and commas in the text
Negative Logits
ominated       -0.65
irie           -0.65
haus           -0.64
rament         -0.62
hower          -0.61
oir            -0.60
ratulations    -0.59
iple           -0.58
robat          -0.57
sheet          -0.57
Positive Logits
noting         1.20
citing         1.08
adding         1.02
implying       1.01
stressing      1.01
namely         0.92
including      0.90
stating        0.89
referring      0.88
although       0.86
Activations Density: 0.436%