INDEX
Explanations
references to social and political movements
Negative Logits
Steel         -0.15
aken          -0.15
ersed         -0.15
oor           -0.15
criptors      -0.14
icts          -0.14
steel         -0.14
Kel           -0.14
steel         -0.14
ostel         -0.14
Positive Logits
grese         0.17
idente        0.16
insky         0.15
ény           0.15
oid           0.15
strokeLine    0.15
iscard        0.14
alen          0.14
constitution  0.14
826           0.14
Activations Density 0.009%