    Negative Logits
    Token          Logit
    " itſelf"      -0.92
    " myſelf"      -0.87
    " Theſe"       -0.83
    " Jefus"       -0.80
    " Monfieur"    -0.79
    " faſt"        -0.78
    " himſelf"     -0.76
    " purpoſe"     -0.75
    " pleaſure"    -0.74
    " neceff"      -0.74
    Positive Logits
    Token          Logit
    "api"          0.77
    " library"     0.67
    "API"          0.63
    "fa"           0.63
    "Api"          0.60
    " Library"     0.60
    "قایناقلار"    0.59
    " API"         0.55
    " Api"         0.54
    " about"       0.54
    Activation Density: 0.099%

    No Known Activations