    Explanations

    leaving something behind

    Negative Logits
    Token        Logit
    "9"          0.54
    "3"          0.52
    "f"          0.52
    "losti"      0.52
    "c"          0.52
    "5"          0.49
    "6"          0.47
    "\t\t"       0.46
    " לאחר"      0.46
    " avere"     0.45
    Positive Logits
    Token        Logit
    " Leaving"   1.03
    " leave"     0.97
    " оставля"   0.96
    " leaving"   0.94
    " Leave"     0.91
    " laissé"    0.91
    " laissent"  0.89
    "leaving"    0.87
    " laisse"    0.85
    " deixa"     0.82
    Activation Density: 0.022%

    No Known Activations