    Explanations

    phrases related to irreversibility or permanent damage

    terms related to irreversibility and permanent change

    Negative Logits
    Token                   Logit
    "ramid"                 -0.80
    " Hunters"              -0.77
    "anwhile"               -0.74
    "ucket"                 -0.73
    " guiActiveUnfocused"   -0.71
    " Trials"               -0.70
    "owler"                 -0.69
    "wagen"                 -0.67
    " Butterfly"            -0.67
    "auri"                  -0.67
    Positive Logits
    Token                   Logit
    "voc"                   1.29
    "parable"               1.02
    " irre"                 0.96
    "viation"               0.91
    "agan"                  0.88
    "itable"                0.87
    "asonable"              0.87
    "lev"                   0.86
    "ality"                 0.85
    "cover"                 0.85
    Activation Density: 0.012%

    No Known Activations