INDEX
    Explanations

    phrases related to transitions or movement between states or entities

    Negative Logits
    se
    -0.19
    ui
    -0.17
    attern
    -0.15
    ro
    -0.15
    omatic
    -0.14
    phin
    -0.14
    ko
    -0.14
    ader
    -0.14
    fusion
    -0.14
    kö
    -0.14
    Positive Logits
     one
    0.37
     satu
    0.28
    一种
    0.24
    _one
    0.24
     одного
    0.23
    one
    0.23
     одне
    0.22
    -one
    0.22
     person
    0.21
     одну
    0.21
    Act Density 0.065%

    No Known Activations