INDEX
    Explanations

    references to governmental policies and political discussions

    Negative Logits
    isode         -0.69
     regist       -0.67
     board        -0.66
     sled         -0.64
     Manhattan    -0.63
     scatter      -0.62
     decomp       -0.62
     charm        -0.61
     mans         -0.60
     scene        -0.60
    Positive Logits
    ¬     1.11
    Ĵ     1.05
    ¡     1.04
    ¤     1.00
    ij     0.97
    Ļ     0.96
    ı     0.95
    ľ     0.93
    _.    0.93
    Ī     0.91
    Activation Density: 0.479%

    No Known Activations