    Explanations

    mentions of different possible outcomes or results

    mentions of "outcome" indicating results or consequences

    Negative Logits
    ker       -0.77
    cer       -0.72
    afort     -0.71
    king      -0.70
    Cola      -0.69
    ovie      -0.68
    nan       -0.67
    pload     -0.67
    uggage    -0.66
    yi        -0.66
    Positive Logits
     outcome          1.12
     outcomes         1.02
    bringer           0.88
     thereof          0.76
     result           0.72
    ebin              0.71
     probabilities    0.70
     Result           0.69
     Winner           0.67
     winner           0.67
    Activation Density: 0.008%

    No Known Activations