    Explanations

    phrases related to unexpected outcomes or transformations

    instances of the word "turn" and its variations, indicating changes or transformations in states or situations

    Negative Logits
    Token         Logit
    "覚醒"        -1.03
    "heed"        -0.67
    "icipated"    -0.64
    "merga"       -0.64
    "ording"      -0.64
    " Ys"         -0.61
    "pton"        -0.61
    "ority"       -0.60
    "cussion"     -0.60
    "erella"      -0.60
    Positive Logits
    Token         Logit
    " into"       1.09
    " sour"       1.09
    " ugly"       0.99
    " upside"     0.95
    " violent"    0.86
    " inward"     0.85
    " INTO"       0.84
    "coat"        0.82
    " stale"      0.82
    " heads"      0.79
    Activation Density: 0.032%

    No Known Activations