INDEX
    Explanations

    references to complex legal or political issues

    Negative Logits
    ه          -0.96
    ing        -0.94
     Parke     -0.78
    ی          -0.77
     Colbert   -0.76
    es         -0.73
    er         -0.72
    ené        -0.71
    ená        -0.70
    tka        -0.69
    Positive Logits
    findpost   0.97
    swag       0.95
     Mousse    0.91
     Rabin     0.91
    UpInside   0.87
     Durand    0.84
     Washer    0.83
    omány      0.82
    suit       0.82
     Oss       0.80
    Activation Density: 0.496%

    No Known Activations