INDEX
    NEGATIVE LOGITS
    Token                       Logit
    " surpass"                  -0.08
    "ไซต"                       -0.06
    " friendly"                 -0.06
    "LAS"                       -0.06
    " Sign"                     -0.06
    " Naughty"                  -0.06
    " Zeus"                     -0.06
    " ByteArrayOutputStream"    -0.06
    " фунда"                    -0.06
    " Chairman"                 -0.06
    POSITIVE LOGITS
    Token                       Logit
    " Winter"                   0.11
    " winter"                   0.10
    "Winter"                    0.09
    " winters"                  0.08
    "Inter"                     0.08
    ".inter"                    0.08
    "within"                    0.07
    "modern"                    0.07
    "aptor"                     0.07
    "wind"                      0.07
    Activation Density: 0.005%

    No Known Activations