INDEX
    Explanations

    instances of greeting phrases or exclamations

    Negative Logits
    "stice"         -0.16
    "rip"           -0.15
    ".uk"           -0.15
    "orian"         -0.14
    "ouis"          -0.14
    "LESS"          -0.14
    (whitespace)    -0.14
    "amd"           -0.14
    "andbox"        -0.13
    "mdir"          -0.13
    Positive Logits
    "assed"         0.16
    " prest"        0.16
    "ettle"         0.15
    "arken"         0.15
    "assen"         0.15
    " infr"         0.15
    "AGR"           0.15
    "ullo"          0.15
    " Ler"          0.15
    "ays"           0.14
    Act Density 0.016%

    No Known Activations