INDEX
    Explanations

    punctuation marks and expressions of emotion or tone

    Negative Logits
    ceae    -0.16
     mor    -0.16
     cons    -0.15
     tra    -0.14
    è§    -0.14
     twig    -0.14
    oÄį    -0.14
     hung    -0.14
    roids    -0.14
    &_    -0.14
    Positive Logits
    stan    0.15
    živ    0.14
     Yao    0.14
    artin    0.14
    avis    0.14
    enson    0.14
     lÃŃ    0.14
    _DM    0.14
    semb    0.14
    .IGNORE    0.14
    Activation Density 0.184%

    No Known Activations