INDEX
    Explanations

    quantifiable measures and terms related to reporting or contributions

    Negative Logits
    Token        Logit
    isation      -0.18
    IFICATION    -0.16
    ishment      -0.15
    ination      -0.14
    ation        -0.14
    istence      -0.14
    ENSITY       -0.14
    ization      -0.14
    ATION        -0.14
    ICATION      -0.14
    Positive Logits
    Token        Logit
    izing        0.36
    ing          0.34
    ting         0.31
    ising        0.29
    uing         0.28
    ning         0.26
    zing         0.26
    uring        0.26
    ucing        0.26
    ming         0.26
    Activation Density: 0.029%

    No Known Activations