INDEX
    Explanations

    reverse opposite inverse

    NEGATIVE LOGITS
    "injlim"          0.57
    (no token shown)  0.56
    " felhasznál"     0.55
    " चल"             0.55
    " rougeâtres"     0.54
    " couper"         0.53
    "<unused73>"      0.53
    "ወሰ"              0.53
    " fett"           0.53
    " புக"            0.52
    POSITIVE LOGITS
    " reverse"    4.32
    " opposite"   4.21
    " Reverse"    3.92
    "reverse"     3.89
    " reversed"   3.85
    "Reverse"     3.84
    "opposite"    3.75
    " inverse"    3.67
    " Opposite"   3.53
    " reverses"   3.37
    Activation Density: 0.358%

    No Known Activations