INDEX
    Explanations

    elements related to relationships and familial structures

    Negative Logits
    Token       Logit
    "apos"      -0.16
    "ersh"      -0.15
    "gom"       -0.15
    "gren"      -0.14
    "orpor"     -0.14
    "enez"      -0.14
    "леч"       -0.14
    "iskey"     -0.14
    "lus"       -0.14
    " crib"     -0.14
    Positive Logits
    Token       Logit
    " what"     0.17
    " studio"   0.17
    " whose"    0.14
    "ovel"      0.14
    " ten"      0.14
    " rect"     0.14
    "kk"        0.14
    " when"     0.14
    " c"        0.14
    " b"        0.13
    Activation Density: 0.394%

    No Known Activations