INDEX
Explanations
terms related to political ideologies and their critiques
Negative Logits
e         -0.80
wdata     -0.69
abiti     -0.66
laşı      -0.62
caption   -0.60
Kla       -0.59
Fletcher  -0.59
ț         -0.58
lotes     -0.58
Kla       -0.58
Positive Logits
itſelf         0.95
myſelf         0.94
Efq            0.88
'\\;'          0.85
becauſe        0.84
BoxDecoration  0.83
leaſt          0.83
cist           0.83
depositphotos  0.83
izm            0.82
Activations Density 0.143%