INDEX
    Explanations

    punctuation marks that indicate the flow or structure of text

    Negative Logits
    Token        Logit
    " Whilst"    -0.21
    " whilst"    -0.19
    "Whilst"     -0.14
    "Many"       -0.13
    "igin"       -0.13
    "asic"       -0.13
    "indow"      -0.12
    "atron"      -0.12
    "ÙĬÙĥÙĬ"     -0.12
    " Tarif"     -0.12
    Positive Logits
    Token          Logit
    " Or"          0.26
    " Hell"        0.25
    "Um"           0.24
    " Um"          0.24
    " Seriously"   0.23
    "Seriously"    0.23
    " honestly"    0.23
    "Or"           0.23
    " honest"      0.22
    "Hell"         0.21
    Activation Density: 0.241%

    No Known Activations