Explanations
references to familial relationships and connections
Negative Logits
2        -0.16
ell      -0.15
dem      -0.15
mann     -0.15
vy       -0.15
-a       -0.14
emy      -0.14
-with    -0.14
Dem      -0.14
em       -0.14
Positive Logits
nier     0.17
ño       0.17
lico     0.16
nego     0.16
nero     0.15
bersome  0.15
нем      0.15
علوم     0.15
्ण       0.15
ń        0.15
Activations Density 0.035%