INDEX
    Explanations

    occurrences of the word "First."

    Negative Logits
    "arat"       -0.18
    "895"        -0.17
    "797"        -0.15
    "uw"         -0.15
    " Vul"       -0.15
    "ê°Ħ"        -0.15
    "itet"       -0.15
    "898"        -0.15
    " incident"  -0.15
    "097"        -0.14
    Positive Logits
    "awks"       0.16
    "ngo"        0.15
    "esch"       0.15
    "หลวà¸ĩ"      0.14
    "onald"      0.14
    " Pods"      0.14
    "pace"       0.14
    "-feedback"  0.14
    "holm"       0.14
    "weis"       0.14
    Act Density 0.031%

    No Known Activations