INDEX
    Explanations

    words related to change or transformation

    terms related to transformation or change

    Negative Logits
    "bis"       -0.73
    "PLIED"     -0.70
    "avering"   -0.60
    "Found"     -0.60
    " WARN"     -0.59
    " bast"     -0.59
    "ept"       -0.58
    "oiler"     -0.58
    "llah"      -0.58
    "blers"     -0.56
    Positive Logits
    " into"      1.21
    "into"       1.10
    "ively"      1.05
    " INTO"      0.98
    " Into"      0.87
    "ational"    0.81
    "ative"      0.79
    "ives"       0.78
    "atted"      0.75
    "ELF"        0.72
    Activation Density 0.072%

    No Known Activations