INDEX
    Explanations

    specific formatting elements or structural markers in the text

    Negative Logits
    Token          Logit
    " Tone"        -0.17
    "bach"         -0.16
    "ices"         -0.16
    "ntax"         -0.15
    "ACES"         -0.15
    "aget"         -0.15
    " comp"        -0.14
    " Liberty"     -0.14
    " Birch"       -0.14
    "abbage"       -0.14
    Positive Logits
    Token          Logit
    " Cooke"       0.18
    "eden"         0.15
    "-terminal"    0.15
    "caff"         0.15
    "hea"          0.14
    "kah"          0.14
    "мени"         0.14
    "anko"         0.14
    "âĸĪâĸĪ"       0.14
    "HOOK"         0.14
    Activation Density: 0.028%

    No Known Activations