    Explanations

    instances of the beginning of sections or paragraphs in text

    Negative Logits
    OGND                -0.99
                        -0.75
     فريبيس             -0.70
    Personensuche       -0.69
     autorytatywna      -0.68
     محفوظة             -0.68
    Autoritní           -0.68
     insuffisamment     -0.68
     المعيارى           -0.68
    __(/*!              -0.67
    Positive Logits
    !”                  0.67
    ,”                  0.63
    ,’”                 0.60
    ↵↵                  0.58
    ”,                  0.56
    \\                  0.54
    ),”                 0.53
    .”                  0.53
    !’                  0.52
                        0.51
    Activation Density: 0.028%

    No Known Activations