INDEX
Explanations
themes related to familial and social relationships
Negative Logits
oken          -0.15
ebi           -0.15
Moral         -0.15
subpackage    -0.14
485           -0.14
å»            -0.14
ozem          -0.14
ltk           -0.14
edy           -0.14
bor           -0.14
Positive Logits
ÄIJT          0.17
Pers          0.16
rice          0.15
998           0.14
chair         0.14
Doyle         0.14
merc          0.14
MARK          0.14
orsi          0.14
ìĤ¼           0.14
Activations Density 0.462%