Explanations
references to social or political movements
Negative Logits
tru       -0.16
ushman    -0.16
ольно     -0.16
punk      -0.15
LinkId    -0.14
gom       -0.14
фік       -0.14
stru      -0.13
topl      -0.13
Fuller    -0.13

Positive Logits
ature     0.15
ideon     0.15
036       0.15
éĸ        0.14
idual     0.14
oes       0.14
acz       0.14
볬        0.14
aved      0.14
_MIC      0.14
Activations Density 0.005%