    Explanations

    punctuation

    Negative Logits
     battlefield    -0.06
     rows           -0.06
     روان           -0.06
                    -0.06
     شكل            -0.06
     собі           -0.06
                    -0.06
    forcement       -0.06
                    -0.06
    Refer           -0.05
    Positive Logits
     interesting    0.07
    ldre            0.07
    uros            0.07
     JSName         0.07
    slideDown       0.07
    lamak           0.06
     useSelector    0.06
    gorith          0.06
    otřeb           0.06
     Ren            0.06
    Activation Density: 0.020%

    No Known Activations