INDEX
    Explanations

    verbs related to change or transition

    Negative Logits
    oref      -0.16
    18        -0.15
    ools      -0.15
    kek       -0.14
    ừ         -0.14
    enheim    -0.14
    clide     -0.14
    œur       -0.14
    isted     -0.14
    jsx       -0.14
    Positive Logits
    ía        0.37
    án        0.35
    á         0.35
    ían       0.31
    emos      0.25
    iam       0.25
    í         0.24
    ás        0.24
    ías       0.23
    ia        0.22
    Act Density 0.009%

    No Known Activations