INDEX
Explanations
themes related to familial relationships and traditional roles
Negative Logits
oga        -0.17
cores      -0.15
angan      -0.15
modal      -0.14
agar       -0.14
ated       -0.14
oder       -0.14
osis       -0.14
borg       -0.14
aro        -0.14
Positive Logits
àm         0.16
æ£         0.16
hoe        0.16
tí         0.15
typ        0.14
Barb       0.14
_nullable  0.14
nard       0.14
ヌ         0.14
artment    0.14
Activations Density 0.329%