INDEX
Explanations
political affiliations and opinions
commas and punctuation marks
Negative Logits
¬¼              -0.87
ãĥ¯ãĥ³          -0.78
²¾              -0.75
=-=-=-=-        -0.70
tnc             -0.67
ãĤ©             -0.66
atorium         -0.65
Ĥİ              -0.65
DragonMagazine  -0.64
gian            -0.64
Positive Logits
compared        1.31
whereas         1.16
versus          1.10
according       1.06
while           0.97
suggesting      0.95
indicating      0.93
contrasted      0.90
implying        0.85
respectively    0.85
Activations Density: 0.154%