INDEX
    Explanations

    descriptive adjectives, particularly the word "pretty"

    Negative Logits
    "ooth"          -0.17
    "odelist"       -0.17
    "asca"          -0.16
    "δρα"           -0.16
    "vary"          -0.15
    "да"            -0.14
    "нт"            -0.14
    " principal"    -0.14
    "eric"          -0.14
    "hatt"          -0.14
    Positive Logits
    "»"             0.16
    "-таки"         0.15
    "ayne"          0.15
    "izm"           0.14
    "ums"           0.14
    "ve"            0.13
    "andon"         0.13
    "EFR"           0.13
    "lish"          0.13
    "angelo"        0.13
    Activation Density: 0.014%

    No Known Activations