INDEX
    Explanations

    markers indicating the start of new sections or important content within texts

    Negative Logits
    Token             Logit
    "es"              -0.71
    "io"              -0.69
    " Paz"            -0.65
    "Paz"             -0.64
    "<blockquote>"    -0.63
    "de"              -0.63
    " paz"            -0.63
    "ly"              -0.62
    "lat"             -0.62
    " deals"          -0.61
    Positive Logits
    Token             Logit
    " }^{*}$"         1.39
    " $=$"            1.36
    " $)$"            1.34
    " }}$,"           1.33
    " $]$"            1.28
    " $>$"            1.28
    "$\$$"            1.27
    " }}$"            1.27
    " )}$"            1.25
    "})}$"            1.23
    Activation Density: 0.223%

    No Known Activations