INDEX
    Explanations

    punctuation marks, particularly periods

    Negative Logits
    "ropolis"   -0.15
    "478"       -0.15
    "ursed"     -0.15
    "κι"        -0.15
    "ONO"       -0.15
    "onic"      -0.14
    "aval"      -0.14
    "ept"       -0.14
    "bic"       -0.14
    " кÑĥ"      -0.14
    Positive Logits
    " Mai"          0.15
    "ucas"          0.14
    "ovan"          0.14
    "aub"           0.14
    "tel"           0.14
    " Macros"       0.14
    "aju"           0.14
    " macros"       0.14
    "breadcrumb"    0.14
    ".Inject"       0.13
    Act Density 0.002%

    No Known Activations