INDEX
    Explanations

    instances of the word "out"

    Negative Logits
    "oleon"          -0.72
    " tyr"           -0.65
    " Municip"       -0.64
    "issance"        -0.63
    " Trailer"       -0.61
    " Corps"         -0.61
    "Tea"            -0.60
    " Lect"          -0.57
    " Conversation"  -0.57
    " Tao"           -0.56
    Positive Logits
    "fitted"         1.05
    "number"         1.03
    "range"          1.01
    "scoring"        1.00
    "stretched"      0.96
    "liest"          0.92
    "fitting"        0.92
    "ranking"        0.91
    "doing"          0.89
    "crop"           0.89
    Act Density 0.029%

    No Known Activations