INDEX
Explanations
references to specific identities or categories, particularly those related to human characteristics and societal constructs
Negative Logits
Sexo                 -0.17
ilon                 -0.16
scrut                -0.14
.setParent           -0.14
[token not shown]    -0.14
kili                 -0.14
Parent               -0.13
364                  -0.13
woord                -0.13
NullOrEmpty          -0.13
Positive Logits
-like                0.27
-esque               0.23
-style               0.23
-wide                0.23
wide                 0.22
-era                 0.22
like                 0.22
sized                0.21
-sized               0.20
-type                0.19
Activations Density 0.024%