INDEX
    Explanations

    references to familial relationships and connections

    Negative Logits
    Token      Logit
    "2"        -0.16
    "ell"      -0.15
    "dem"      -0.15
    "mann"     -0.15
    "vy"       -0.15
    "-a"       -0.14
    "emy"      -0.14
    "-with"    -0.14
    " Dem"     -0.14
    "em"       -0.14
    Positive Logits
    Token      Logit
    "nier"     0.17
    "ño"       0.17
    "lico"     0.16
    "nego"     0.16
    "nero"     0.15
    "bersome"  0.15
    "нем"      0.15
    "علوم"     0.15
    "्ण"       0.15
    "ń"        0.15
    Activation Density 0.035%

    No Known Activations