    Explanations

    phrases related to actions and conditions in various contexts

    Negative Logits
    Token          Logit
    " comb"        -0.16
    "hte"          -0.15
    "OTAL"         -0.15
    "qc"           -0.14
    "pedia"        -0.14
    " corner"      -0.14
    "eldorf"       -0.14
    "azon"         -0.14
    "297"          -0.13
    "ิà¸į"         -0.13

    Positive Logits
    Token          Logit
    " reversed"    0.30
    " vice"        0.29
    " reverse"     0.28
    "Reverse"      0.25
    " reversal"    0.24
    " Vice"        0.24
    " Reverse"     0.23
    "reverse"      0.23
    " inverse"     0.23
    " reversing"   0.23
    Act Density 0.149%

    No Known Activations