    Negative Logits
    Token           Logit
    " minden"       -0.08
    " declining"    -0.07
    " interfere"    -0.07
    " err"          -0.07
    " Coming"       -0.07
    " gn"           -0.07
    " mend"         -0.07
    " tuples"       -0.07
    " January"      -0.07
    " prolonged"    -0.07
    Positive Logits
    Token           Logit
    " İngiliz"      0.09
    (missing)       0.08
    (missing)       0.08
    " physical"     0.08
    "phy"           0.08
    " fís"          0.08
    " Physics"      0.08
    " плот"         0.08
    "imple"         0.08
    " kích"         0.08
    Activation Density: 0.032%

    No Known Activations