Explanations
narrative elements related to personal stories and familial relationships
Negative Logits
udiant    -0.16
ichick    -0.15
殿        -0.15
acea      -0.15
utr       -0.14
ÑĮко      -0.14
ادÙĬ      -0.14
opak      -0.13
'Ñı       -0.13
elage     -0.13
Positive Logits
umin       0.14
θη         0.14
integr     0.14
itty       0.14
arehouse   0.13
ç§ij       0.13
³³ ³³      0.13
åĪļæīį     0.13
proudly    0.13
omat       0.13
Activations Density: 0.098%