INDEX
    Explanations

    mentions or instances of rules or restrictions being enforced or proposed

    references to prohibitions or bans on various subjects

    Negative Logits
    " IMAGES"         -0.68
    " prest"          -0.65
    " rendition"      -0.64
    " Maid"           -0.64
    " Generations"    -0.63
    " Turk"           -0.63
    " Tuc"            -0.61
    " Cous"           -0.61
    " contrace"       -0.60
    " Io"             -0.60

    Positive Logits
    "ishment"          1.43
    "hammer"           1.15
    "ishing"           1.13
    "eful"             1.11
    "lie"              0.99
    "jo"               0.98
    "zai"              0.95
    "tering"           0.95
    "nered"            0.94
    "ning"             0.93
    Activation Density: 0.046%

    No Known Activations