INDEX
    Explanations

    criticism and controversies surrounding political figures, particularly discussions related to alliances, statements, and actions of political leaders

    Negative Logits
    Token              Logit
    " Ename"           -0.81
    "<bos>"            -0.75
    " unspeak"         -0.74
    " enlight"         -0.70
    " apprehen"        -0.70
    " unwarran"        -0.68
    " Washable"        -0.68
    " Permeability"    -0.67
    " encomp"          -0.65
    " tolerably"       -0.65
    Positive Logits
    Token              Logit
    " ideolog"         0.67
    " himself"         0.64
    "himself"          0.59
    " republi"         0.58
    " Obrador"         0.58
    " religione"       0.57
    " Republi"         0.56
    " misst"           0.56
    " akus"            0.54
    " biograf"         0.54
    Act Density 0.696%

    No Known Activations