Explanations
instances of familial relationships and social dynamics
Negative Logits
Token     Logit
æ£        -0.17
Regents   -0.15
forge     -0.15
orum      -0.14
è¦        -0.14
aqu       -0.14
Muham     -0.14
貴        -0.14
quis      -0.14
cac       -0.14
Positive Logits
Token     Logit
ji        0.35
ji        0.34
Ji        0.32
sir       0.25
jis       0.25
JI        0.23
alias     0.22
Sir       0.22
alias     0.22
ki        0.21
Activations Density 0.277%