INDEX
    Explanations

    words or phrases related to being banned or barred from certain actions or places

    terms related to restrictions or prohibitions

    Negative Logits
    "worn"           -0.81
    "Engineers"      -0.73
    "imon"           -0.68
    "cule"           -0.66
    "hidden"         -0.66
    "framework"      -0.65
    " disappoint"    -0.65
    "dimension"      -0.65
    "ensional"       -0.63
    " });"           -0.63
    Positive Logits
    " indefinitely"     0.88
    " altogether"       0.85
    "atever"            0.82
    " interstate"       0.76
    " premises"         0.72
    " activities"       0.71
    " accessing"        0.70
    " participating"    0.70
    " flights"          0.69
    " offending"        0.68
    Activation Density 0.477%

    No Known Activations