    Explanations

    the word "out" or variations of it, such as "outs" and "OUT"

    variations of the word "out."

    Negative Logits
    Token                                 Logit
    " arsen"                              -0.90
    "avorite"                             -0.74
    "ajo"                                 -0.68
    "anguage"                             -0.68
    "--------------------------------"    -0.65
    " subp"                               -0.65
    " downward"                           -0.65
    "ε"                                   -0.63
    " trem"                               -0.63
    "ute"                                 -0.63

    Positive Logits
    Token                                 Logit
    "doors"                                1.01
    "lier"                                 0.96
    "dated"                                0.94
    "landish"                              0.94
    "door"                                 0.94
    "fitted"                               0.92
    "stretched"                            0.92
    "raged"                                0.89
    "numbered"                             0.89
    "fits"                                 0.87
    Activation Density: 0.036%

    No Known Activations