INDEX
    Explanations
    Negative Logits
    "(am"             -0.07
    "\tfun"           -0.06
    " swingerclub"    -0.06
    "After"           -0.06
    ".put"            -0.06
    "Ross"            -0.06
    "after"           -0.06
    ".fail"           -0.06
    "ali"             -0.06
    "pis"             -0.06
    Positive Logits
    " decid"          0.07
    " справж"         0.07
    "år"              0.07
    "MemoryWarning"   0.07
    " shielding"      0.06
    "Direccion"       0.06
    (blank)           0.06
    " ORD"            0.06
    (blank)           0.06
    "False"           0.06
    Act Density 0.000%

    No Known Activations