Explanations
terms related to innate or natural qualities and characteristics
Negative Logits
ETS          -0.17
indr         -0.16
descargar    -0.15
彦           -0.15
fk           -0.15
acco         -0.14
fik          -0.14
orias        -0.13
ignant       -0.13
/th          -0.13
Positive Logits
natural      0.33
Natural      0.32
Natural      0.31
naturally    0.28
natural      0.28
/native      0.26
doğal        0.24
Naturally    0.22
/default     0.22
自然         0.22
Activations Density 0.091%