    Explanations

    variations of the word "original" and referential words indicating modifications or changes

    Negative Logits
    Token               Logit
    "izzas"             -0.15
    " other"            -0.14
    "åŃĹ"               -0.14
    "izr"               -0.13
    "meden"             -0.13
    "ennen"             -0.13
    "liqu"              -0.13
    " yat"              -0.13
    "loys"              -0.12
    " Hydra"            -0.12
    Positive Logits
    Token               Logit
    "/current"          0.18
    "/original"         0.17
    "asco"              0.16
    "oti"               0.15
    "alah"              0.15
    "Enlarge"           0.15
    "tout"              0.14
    "ovan"              0.14
    "ActionCreators"    0.14
    "abe"               0.14
    Activation Density: 0.099%

    No Known Activations