    Negative Logits
    Token              Logit
    " purpoſe"         -0.65
    " pleaſure"        -0.58
    "PerformLayout"    -0.57
    " ſtate"           -0.55
    " jSON"            -0.54
    " acrylique"       -0.52
    " Majefty"         -0.48
    " ſtand"           -0.47
    " ainfi"           -0.47
    " perſon"          -0.46
    Positive Logits
    Token              Logit
    " without"         1.73
    "without"          1.73
    "Without"          1.67
    " Without"         1.63
    " WITHOUT"         1.55
    "WITHOUT"          1.55
    "Ohne"             1.22
    " без"             1.20
    " senza"           1.20
    " zonder"          1.18
    Activation Density: 0.027%

    No Known Activations