    Explanations

    phrases indicating a return or restoration to a previous state

    Negative Logits
        Token            Logit
        " Various"       -0.14
        "ütün"           -0.14
        "ibur"           -0.14
        "anine"          -0.14
        "_pv"            -0.14
        "zw"             -0.14
        "ledon"          -0.14
        "рел"            -0.13
        " различных"     -0.13
        "onna"           -0.13
    Positive Logits
        Token            Logit
        " normal"        0.34
        " basics"        0.31
        "normal"         0.28
        " sender"        0.25
        "-normal"        0.25
        " roots"         0.24
        " Basics"        0.23
        " fold"          0.23
        "Normal"         0.23
        " NORMAL"        0.23
    Activation Density: 0.103%
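
    The density figure above can be read as the share of tokens on which the feature fires. The sketch below is a minimal illustration, assuming density means the fraction of tokens with an activation above zero; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" counts a token as active when its activation
    is strictly greater than the threshold (zero by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 tokens, of which 103 activate the feature.
acts = [0.0] * 99_897 + [1.5] * 103
density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.103%
```

    Under this reading, a density of 0.103% corresponds to roughly one active token in every thousand.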

    No Known Activations