    Explanations

    expressions of appreciation or compliments

    Negative Logits
    Token           Logit
    ' }}$}'         -0.92
    ' myſelf'       -0.85
    'ſelf'          -0.81
    ' cherchés'     -0.77
    '.[/'           -0.76
    ' ***/'         -0.75
    'دانشنامهٔ'       -0.74
    ' ―――――'        -0.72
    ' itſelf'       -0.71
    '!")\n'         -0.71

    Positive Logits
    Token           Logit
    '<eos>'         0.69
    ' I'            0.63
    ' it'           0.63
    '//'            0.60
    'podar'         0.60
    ' Go'           0.59
    ' The'          0.56
    ' To'           0.56
    ' Do'           0.55
    ' i'            0.54

    Activation Density: 0.151%

    No Known Activations