INDEX
Explanations
phrases related to political discussions and actions
punctuation, particularly commas and quotation marks
Negative Logits
ratulations   -0.64
çīĪ           -0.62
ixt           -0.62
ominated      -0.59
aimon         -0.56
robat         -0.56
rament        -0.56
rius          -0.55
seiz          -0.55
!'            -0.55
Positive Logits
noting        1.15
citing        1.02
adding        1.02
implying      1.00
namely        0.99
although      0.96
stressing     0.93
including     0.92
whereas       0.91
including     0.91
Activations Density: 0.463%