INDEX
    Explanations

    references to moderation in a community or platform context

    Negative Logits
    Token           Logit
    "uxxxx"         -0.33
    (blank)         -0.31
    "mentare"       -0.30
    "ittu"          -0.29
    "Caja"          -0.29
    " currently"    -0.28
    " Czer"         -0.27
    "രിക്ക"          -0.27
    " Iyer"         -0.27
    "äume"          -0.27
    Positive Logits
    Token           Logit
    "Mod"           2.59
    " Mod"          2.42
    "Mods"          1.70
    "mods"          1.65
    " Mods"         1.62
    " MOD"          1.61
    " mod"          1.55
    "mod"           1.55
    "MOD"           1.53
    " mods"         1.44
    Activation Density: 0.003%

    No Known Activations