INDEX
    Explanations

    phrases indicating permission or the act of allowing something or someone

    Negative Logits
    Token           Logit
    "ryn"           -0.18
    "idelberg"      -0.16
    "ynam"          -0.16
    "евер"          -0.16
    ".framework"    -0.15
    ".Slf"          -0.14
    "mie"           -0.14
    "irie"          -0.14
    "eden"          -0.14
    "tram"          -0.14
    Positive Logits
    Token           Logit
    "ouch"          0.16
    " others"       0.15
    "oha"           0.15
    "heim"          0.14
    "OOK"           0.14
    "ipl"           0.14
    "unge"          0.13
    "GO"            0.13
    "-go"           0.13
    "691"           0.13
    Activation Density: 0.045%

    No Known Activations