    Explanations

    mentions of winning or achievements

    instances of the word "win."

    Negative Logits
    "erity"         -0.75
    "umn"           -0.64
    "ikk"           -0.64
    " protr"        -0.62
    " includ"       -0.62
    " Uz"           -0.61
    " footprint"    -0.60
    "Else"          -0.60
    "ciplinary"     -0.60
    " assembled"    -0.59
    Positive Logits
    "ners"       0.91
    "nings"      0.87
    "now"        0.76
    "win"        0.76
    "throp"      0.75
    "ception"    0.74
    "iors"       0.74
    "iem"        0.74
    "ces"        0.72
    "'t"         0.72
    Act Density 0.026%

    No Known Activations