INDEX
    Explanations

    discussions surrounding internet policies and regulations

    Negative Logits
    Token            Logit
    "arias"          -0.16
    "CRM"            -0.15
    "phetamine"      -0.14
    "ardu"           -0.14
    " discharged"    -0.14
    "_PROC"          -0.14
    " eş"            -0.13
    "795"            -0.13
    " hang"          -0.13
    "avra"           -0.13
    Positive Logits
    Token            Logit
    " blocking"      0.31
    " blocks"        0.31
    " blocked"       0.30
    " block"         0.30
    " filtering"     0.29
    " Filtering"     0.27
    " Blocks"        0.27
    " content"       0.27
    "blocking"       0.27
    " Blocked"       0.26
    Act Density 0.025%

    No Known Activations