INDEX
    Explanations

    phrases related to different potential outcomes

    Negative Logits
    Token       Logit
    "ker"       -0.77
    "nan"       -0.76
    "cer"       -0.74
    "ju"        -0.70
    "ondo"      -0.70
    "uni"       -0.70
    "uga"       -0.69
    "uzz"       -0.69
    "king"      -0.69
    "elong"     -0.69
    Positive Logits
    Token         Logit
    " outcome"    1.20
    " outcomes"   1.02
    "bringer"     0.82
    " result"     0.77
    " Result"     0.76
    " thereof"    0.76
    " Winner"     0.71
    " Orche"      0.70
    " Cruel"      0.70
    " winner"     0.69
    Activation Density: 0.011%

    No Known Activations