INDEX
    Explanations

    test scenarios and correctness

    Negative Logits
    clonal          0.44
    GRANT           0.42
    Appro           0.39
     приблизи       0.39
    Approx          0.38
    physema         0.38
    оте             0.38
    лье             0.38
    прода           0.38
    τεί             0.38
    Positive Logits
     correctness    0.60
     tests          0.59
     symmetries     0.59
     robustness     0.58
     trigonometric  0.58
     symmetry       0.56
     tested         0.52
     correctly      0.52
     cases          0.51
    错误            0.51
    Act Density 0.317%

    No Known Activations