INDEX
    Explanations
    NEGATIVE LOGITS
    Token          Logit
    " découv"      -1.13
    "Bie"          -0.97
    "满脸"         -0.96
    "vdash"        -0.95
    "ґ"            -0.95
    " rencontr"    -0.94
    "olog"         -0.94
    " rencont"     -0.94
    "fol"          -0.93
    POSITIVE LOGITS
    Token          Logit
    " return"      1.86
    " continue"    1.63
    " result"      1.52
    " break"       1.46
    "return"       1.38
    " also"        1.23
    " throw"       1.16
    " results"     1.14
    " returned"    1.09
    "break"        1.06
    Activation Density: 0.018%

    No Known Activations