Explanations
References to specific individuals or groups involved in discussions of social and political issues
Negative Logits
plist    -0.15
fd       -0.15
ulet     -0.14
roat     -0.14
anz      -0.14
afür     -0.14
Buzz     -0.14
reten    -0.14
aye      -0.14
weg      -0.14
Positive Logits
czy      0.16
enn      0.15
croft    0.14
imed     0.14
çı       0.14
Ñģон     0.14
Encoded  0.14
.fil     0.14
esso     0.13
.ts      0.13
Activations Density 0.034%