INDEX
    NEGATIVE LOGITS
    Token (quoted)      Logit
    "Портал"            -0.54
    " térmico"          -0.50
    "XmlAccessorType"   -0.49
    "Artic"             -0.47
    "Cork"              -0.47
    "IOM"               -0.46
    "kunci"             -0.45
    " Zuma"             -0.45
    "RegressionTest"    -0.45
    "columnHeader"      -0.45
    POSITIVE LOGITS
    Token (quoted)      Logit
    " instead"          1.93
    "instead"           1.78
    "Instead"           1.63
    " Instead"          1.59
    " вместо"           1.32
    " zamiast"          1.27
    " istället"         1.20
    " invece"           1.19
    " rather"           1.09
    "vece"              1.07
    Activation Density: 0.017%

    No Known Activations