    Explanations

    Python function purpose

    Negative Logits
    Token          Value
    "outi"         0.41
    "ల్యే"          0.37
    "leans"        0.37
    "<0xAC>"       0.36
    " Estudio"     0.35
    " Everybody"   0.35
    "SAFER"        0.35
    "Amanda"       0.35
    " modo"        0.34
    "unier"        0.34
    Positive Logits
    Token          Value
    " accepts"     0.62
    " принимает"   0.55
    " accett"      0.53
    " takes"       0.52
    " принима"     0.52
    " accepting"   0.52
    " прийма"      0.50
    " accept"      0.49
    " Accepts"     0.48
    " принимать"   0.48
    Activation Density: 0.011%

    No Known Activations