INDEX
    Explanations

    promote headers

    Negative Logits
    Token           Logit
    "utron"         -0.09
    " corticost"    -0.09
    " neutron"      -0.09
    " chirurgie"    -0.08
    " nark"         -0.08
    " эколог"       -0.08
    " erit"         -0.08
    " whiskey"      -0.08
    " kub"          -0.08
    " opioid"       -0.08
    Positive Logits
    Token           Logit
    ".iloc"         0.11
    "<thead"        0.09
    "andas"         0.09
    "thead"         0.08
    (blank)         0.08
    "Fixture"       0.08
    (blank)         0.08
    " capt"         0.08
    " row"          0.08
    "Layer"         0.08
    Activation Density: 0.004%

    No Known Activations