INDEX
Explanations
verbs and actions related to power dynamics and social issues
Negative Logits
udge         -0.15
anean        -0.15
uds          -0.15
icus         -0.15
ibar         -0.15
ERCHANT      -0.15
cratch       -0.14
ampus        -0.14
391          -0.14
.gs          -0.14
Positive Logits
ê·ł          0.14
adesh        0.14
GenericType  0.14
ëĦ·          0.14
ValuePair    0.14
Äijình       0.13
owell        0.13
íĤ           0.13
Brush        0.13
↵            0.13
Activations Density 0.152%