INDEX
    Explanations

    specific Japanese words and punctuation marks in the text

    Negative Logits
    Token               Logit
    " Roskov"           -0.64
    "Hentet"            -0.59
    "+#+"               -0.56
    "twimg"             -0.52
    " Chwiliwch"        -0.51
    "fromnode"          -0.50
    " arşivlendi"       -0.48
    " Taktlose"         -0.46
    " NSCoder"          -0.46
    " Signalez"         -0.46
    Positive Logits
    Token               Logit
    " navideña"         0.38
    " Nara"             0.36
    "mop"               0.36
    "juna"              0.35
    "CloseOperation"    0.35
    " CWE"              0.34
    (blank)             0.34
    ":]:"               0.34
    "timme"             0.33
    "jenih"             0.33
    Activation Density: 0.053%

    No Known Activations