INDEX
    Explanations

    references to uncovering hidden information or secrets

    Negative Logits
     défend        -0.56
    AfterEach      -0.50
     smtplib       -0.50
     reconnaît     -0.50
     Aftermath     -0.45
     entraîne      -0.45
     Tanjung       -0.44
     Muhamma       -0.44
     dépasse       -0.44
     accompagne    -0.44
    Positive Logits
     sappi         0.93
    <bos>          0.89
     sembrano      0.80
     parlano       0.80
     scopri        0.79
     morire        0.78
     vogli         0.78
     anse          0.75
     abbandon      0.71
     torner        0.71
    Act Density 0.306%

    No Known Activations