INDEX
    Explanations
    NEGATIVE LOGITS
    Token          Logit
    "OD"           -0.06
    " Asp"         -0.06
    " elastic"     -0.06
    "Estado"       -0.06
    "щество"       -0.06
    "\taddress"    -0.06
    "omorphic"     -0.06
    "output"       -0.06
    "ACTION"       -0.06
    "climate"      -0.06

    POSITIVE LOGITS
    Token          Logit
    "dued"          0.07
    "[++"           0.07
    " exercise"     0.07
    "ThanOr"        0.07
    " Υπο"          0.07
    " fanatic"      0.07
    " finalist"     0.07
    " tyto"         0.07
    ":both"         0.06
    " robber"       0.06
    Activation Density: 0.007%

    No Known Activations