INDEX
Explanations
descriptive attributes related to physical appearance and characteristics
Negative Logits
Token       Logit
cete        -0.16
ÑĸÑĪ        -0.15
gia         -0.15
033         -0.14
038         -0.14
ammer       -0.14
azz         -0.14
amac        -0.13
usses       -0.13
ulet        -0.13
Positive Logits
Token       Logit
enth        0.18
onne        0.15
AGR         0.15
ynamo       0.14
_android    0.14
trait       0.14
CCR         0.14
buie        0.13
eration     0.13
cura        0.13
Activations Density: 0.005%