    Explanations

    phrases with the word "which" followed by a subject or object

    occurrences of delimiters or punctuation, specifically end-of-text markers

    Negative Logits
    Token           Logit
    " Vaugh"        -0.68
    " Seym"         -0.67
    " Niet"         -0.60
    "Instead"       -0.60
    " Darling"      -0.58
    "Fram"          -0.57
    "Tokens"        -0.57
    " Frie"         -0.56
    " disadvant"    -0.55
    "Daddy"         -0.54
    Positive Logits
    Token             Logit
    "zbollah"         0.82
    "ersive"          0.70
    "imes"            0.62
    " awaits"         0.61
    "ims"             0.60
    " awaited"        0.60
    " rejo"           0.60
    "embed"           0.60
    ";;;;;;;;;;;;"    0.59
    "usalem"          0.59
    Act Density 0.097%

    No Known Activations