INDEX
    Explanations

    negative statements about political power

    NEGATIVE LOGITS
    })`
    -0.74
    )";
    
    -0.55
    }")
    
    -0.53
    )");
    
    -0.51
     Tense
    -0.49
     nôtre
    -0.49
    }]
    
    -0.49
    AndEndTag
    -0.48
    )]
    
    -0.48
    '):
    
    -0.47
    POSITIVE LOGITS
     second
    1.07
    second
    0.88
     Second
    0.78
     SECOND
    0.77
     third
    0.76
     secondary
    0.74
     seconde
    0.73
    SECOND
    0.72
    Second
    0.71
     segundo
    0.68
    Activation Density 1.732%

    No Known Activations