    Explanations

    words related to the concept of reversal

    instances of the word "reverse."

    Negative Logits
    " Athlet"      -0.69
    "camp"         -0.68
    "haw"          -0.67
    "te"           -0.65
    "ovie"         -0.63
    " Matters"     -0.63
    " Concern"     -0.62
    "ten"          -0.62
    "fle"          -0.62
    "lvl"          -0.61
    Positive Logits
    " reverse"      3.82
    "reverse"       2.77
    " Reverse"      2.52
    " reversed"     2.24
    " reversing"    2.11
    " inverse"      1.74
    " reversal"     1.68
    " revers"       1.63
    " inverted"     1.44
    " flip"         1.31
    Act Density 0.014%

    No Known Activations