    Explanations

    action movies

    Negative Logits
    	ax              -0.06
     حسين            -0.06
    альный          -0.06
     notification   -0.06
    ули             -0.06
     rows           -0.06
     Abdul          -0.06
     vermek         -0.06
    mtree           -0.06
     Prostitutas    -0.06
    Positive Logits
     satur          0.07
    ournal          0.07
    aar             0.07
     paranormal     0.07
     реє            0.07
    Began           0.06
     Soda           0.06
                    0.06
     darn           0.06
                    0.06
    Act Density 0.036%

    No Known Activations