INDEX
    Explanations

    single quotes and punctuation marks, indicating a focus on direct quotes or dialogue within the text

    numbers and specifications

    Negative Logits
    Token             Logit
    " but"            -0.31
    "Fortunately"     -0.26
    "."               -0.25
    " what"           -0.25
    " either"         -0.25
    "Luckily"         -0.24
    " Forschungs"     -0.23
    " after"          -0.23
    " Fortunately"    -0.22
    " rağmen"         -0.22
    Positive Logits
    Token                Logit
    "GEBURTSDATUM"       0.99
    " autorytatywna"     0.97
    " noDo"              0.90
    " <<<<<<<<<<<<<<"    0.88
    " Normdatei"         0.85
    ":✨"                 0.85
    "<pad>"              0.83
    "<unused51>"         0.82
    "<unused43>"         0.82
    "<unused23>"         0.82
    Activation Density: 0.434%

    No Known Activations