Explanations
terms related to colonialism and its aspects
Negative Logits
Token       Logit
itar        -0.16
wend        -0.16
wyn         -0.15
eon         -0.15
lyn         -0.15
itarian     -0.14
itan        -0.14
olie        -0.14
mia         -0.14
itage       -0.14
Positive Logits
Token       Logit
-era        0.19
readcr      0.15
rif         0.15
baum        0.14
GBK         0.14
/catalog    0.14
-*-č↵       0.14
ìĭ¬         0.14
busters     0.14
cratch      0.14
Activations Density: 0.040%