INDEX
    Explanations

    punctuation marks, specifically question marks and periods

    Negative Logits
     Něm
    -0.16
    roman
    -0.15
    oran
    -0.15
     popis
    -0.14
    zimmer
    -0.14
    лан
    -0.14
    esson
    -0.14
    arel
    -0.14
    ella
    -0.14
    _Block
    -0.14
    Positive Logits
     plaster
    0.16
     Wars
    0.15
    ulp
    0.15
    μισ
    0.15
    930
    0.15
    ινό
    0.14
    Wars
    0.14
    犬
    0.14
    929
    0.14
    awaiter
    0.14
    Act Density 0.002%

    No Known Activations