INDEX
    Explanations
    Negative Logits
     conceded     -0.07
     Likely       -0.07
     house        -0.07
    над           -0.06
     disables     -0.06
    ausible       -0.06
     bash         -0.06
     glass        -0.06
     Zero         -0.06
    .inflate      -0.06
    Positive Logits
    _softmax      0.07
     مق           0.07
    $(".          0.06
    GN            0.06
    -treated      0.06
                  0.06
    	gr        0.06
    >tagger       0.06
     suffers      0.06
     задов        0.06
    Act Density 0.030%

    No Known Activations