INDEX
    Explanations
    Negative Logits
    Token         Logit
    " Cur"        -0.08
    "llo"         -0.08
    " Crane"      -0.07
    " Есть"       -0.07
    " magnet"     -0.07
    " Mesa"       -0.07
    " Henry"      -0.07
    " Percy"      -0.07
    " plekken"    -0.07
    " Ie"         -0.07
    Positive Logits
    Token            Logit
    "Membership"     0.10
    "Members"        0.09
    "_member"        0.09
    "genoten"        0.09
    "mates"          0.09
    "成员"           0.09
    " Members"       0.09
    "mate"           0.09
    " Membership"    0.08
    "/group"         0.08
    Activation Density: 0.004%

    No Known Activations