    Explanations

    expressions related to leaving and resulting actions or consequences

    Negative Logits
    Token         Logit
    "ecided"      -0.14
    "ahn"         -0.14
    ".ix"         -0.14
    "ège"         -0.14
    "ottage"      -0.14
    "plusplus"    -0.14
    "jed"         -0.13
    "drv"         -0.13
    "svp"         -0.13
    "oice"        -0.13
    Positive Logits
    Token         Logit
    " leaving"    0.90
    " leave"      0.84
    " Leave"      0.77
    " leaves"     0.77
    " Leaving"    0.75
    "Leave"       0.74
    "leave"       0.72
    " Leaves"     0.66
    "_leave"      0.56
    "çķĻ"         0.55
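
The last positive-logit token, çķĻ, looks like a raw vocabulary string rather than readable text. A minimal sketch of how it could be decoded is given here, assuming the model uses a GPT-2-style byte-level BPE vocabulary (the page does not name the tokenizer). The helper names `bytes_to_unicode` and `decode_token` are illustrative, not part of the dashboard.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2-style table mapping each byte (0-255) to a printable character."""
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point at 256 and above.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


def decode_token(token: str) -> str:
    """Turn a vocabulary string back into text (assumes a byte-level BPE vocabulary)."""
    inverse = {ch: b for b, ch in bytes_to_unicode().items()}
    raw = bytearray()
    for ch in token:
        if ch in inverse:
            raw.append(inverse[ch])
        else:
            # Characters outside the table (e.g. a literal space) pass through as UTF-8.
            raw.extend(ch.encode("utf-8"))
    return raw.decode("utf-8", errors="replace")


print(decode_token("çķĻ"))     # 留
print(decode_token("_leave"))  # _leave
```

Under that assumption the token decodes to the single character 留, which would fit the rest of the list if read in its sense of staying or leaving something behind. ASCII tokens such as `_leave` pass through unchanged.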
    Activation Density 0.185%

    No Known Activations