INDEX
    Explanations

    punctuation and questioning expressions

    Negative Logits
    ublish
    -0.18
    tron
    -0.15
    opis
    -0.14
    ombok
    -0.14
    夫
    -0.14
    ythe
    -0.13
    .scalajs
    -0.13
    .basic
    -0.13
     Gardens
    -0.13
    hiro
    -0.13
    Positive Logits
     so
    0.34
     So
    0.30
    So
    0.28
    那么
    0.28
    	So
    0.25
     why
    0.22
     Why
    0.22
    so
    0.21
     Vậy
    0.21
    -so
    0.20
    Activation Density 0.116%

    No Known Activations