INDEX
    Explanations

    the beginning of new sections or paragraphs in a text

    Negative Logits
    Token               Logit
    " IMDG"             -0.70
    "BDL"               -0.68
    "}}}$"              -0.65
    " Pani"             -0.63
    " Beller"           -0.63
    " cheminée"         -0.63
    "RSI"               -0.62
    " Lijst"            -0.62
    "колеп"             -0.60
    "Urs"               -0.59
    Positive Logits
    Token               Logit
    (blank)             0.78
    "FTFY"              0.75
    " Elsa"             0.70
    "ftagPool"          0.66
    " Zang"             0.66
    " BeautifulSoup"    0.66
    (blank)             0.66
    "saat"              0.65
    "Elsa"              0.63
    "ToProps"           0.63
    Activation Density: 0.060%

    No Known Activations