    Negative Logits
     Prostate     0.74
     prostate     0.70
     limb         0.69
    tests         0.67
    PrivateKey    0.66
    UnitTest      0.66
    Tests         0.65
     Toxicology   0.65
                  0.64
    Toilet        0.64
    Positive Logits
     stadium      0.69
    广播          0.69
     Rac          0.68
     sted         0.68
     фикси        0.67
     fixed        0.67
     Fixed        0.67
     carriers     0.66
     arreg        0.65
     surviv       0.62
    Act Density 0.002%

    No Known Activations