    Explanations

    phrases indicating established criteria or benchmarks

    Negative Logits
    Token           Logit
    "Standard"      -0.18
    " Standards"    -0.18
    " Standard"     -0.17
    "_std"          -0.16
    "standard"      -0.16
    " standard"     -0.16
    "ستان"          -0.16
    "std"           -0.16
    "onder"         -0.16
    "orman"         -0.16
    Positive Logits
    Token           Logit
    "ised"          0.51
    "ization"       0.48
    "ize"           0.42
    "-issue"        0.40
    "izing"         0.39
    "isation"       0.38
    "-setting"      0.38
    "ized"          0.37
    " deviation"    0.35
    "izes"          0.35
    Activation Density: 0.034%

    No Known Activations