INDEX
    Explanations

    politically charged language surrounding government actions and controversies

    Negative Logits
    Token               Logit
    " nece"             -0.58
    "--){"              -0.56
    "είται"             -0.56
    " OnEnable"         -0.56
    " indisponible"     -0.55
    " susten"           -0.54
    " transférez"       -0.54
    " Nego"             -0.54
    "ʁ"                 -0.54
    "nhold"             -0.53

    Positive Logits
    Token               Logit
    " pourtant"         0.66
    "StructEnd"         0.52
    "GeneratedCode"     0.51
    "ItemBackground"    0.50
    " sekal"            0.49
    "ftagPool"          0.49
    " rağmen"           0.45
    " FAILED"           0.44
    " failed"           0.43
    " supposedly"       0.43
    Activation Density: 0.495%
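
    The dashboard does not define the density figure. A minimal sketch, assuming it is the fraction of sampled token positions on which the feature's activation is above zero; the function name, the threshold, and the sample counts below are hypothetical and chosen only to illustrate the arithmetic:

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature fires at all; the source page does not state its definition.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 200,000 token positions, feature active on 990 of them.
sample = [1.3] * 990 + [0.0] * 199_010
density = activation_density(sample)
print(f"{density:.3%}")
```

    With these made-up counts the ratio is 990 / 200,000, which formats as 0.495%, the same order of sparsity the dashboard reports.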

    No Known Activations