INDEX
    Explanations

    words related to changes, modifications, or transformations

    the word "change" in various contexts

    Negative Logits
    Token                 Logit
    "-+-+"                -0.76
    "ç«"                  -0.75
    "amina"               -0.74
    " AFB"                -0.72
    "¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯"    -0.72
    "mination"            -0.70
    "sonian"              -0.70
    "APH"                 -0.69
    "vern"                -0.69
    "PLIED"               -0.68

    Positive Logits
    Token                 Logit
    "overs"               0.81
    "agents"              0.80
    "ogue"                0.77
    "ĸļ"                  0.76
    "over"                0.75
    "xual"                0.70
    "backs"               0.69
    "atile"               0.69
    " wrought"            0.68
    "imedia"              0.68
    Activation Density: 0.039%

    No Known Activations