INDEX
    Explanations

    phrases related to causing instability or disruption

    terms related to instability and disruption in political or social contexts

    Negative Logits
    Token       Logit
    ramid       -0.86
    atana       -0.83
    ewitness    -0.81
    tis         -0.75
    uli         -0.74
    Quotes      -0.74
    aret        -0.74
    une         -0.74
    aro         -0.73
    inct        -0.71
    Positive Logits
    Token       Logit
     destabil   0.88
    ãĤ¼ãĤ¦ãĤ¹   0.83
    ized        0.75
    itic        0.73
     Hels       0.72
    ãĥ¼ãĥĨ      0.71
     Mobil      0.71
    izing       0.69
    ciating     0.68
     Lester     0.68
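
Two of the positive-logit tokens (ãĤ¼ãĤ¦ãĤ¹ and ãĥ¼ãĥĨ) look like raw byte-level vocabulary strings rather than readable text. The source does not say which tokenizer produced them; the sketch below assumes a GPT-2-style byte-level BPE vocabulary, where every byte is mapped to a printable stand-in character. The helper names `bytes_to_unicode` and `decode_token` are illustrative, not part of the dashboard.

```python
def bytes_to_unicode():
    """Byte -> printable character table, in the style of GPT-2's byte-level BPE (assumed)."""
    # Bytes that are already printable keep their own code point.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    shift = 0
    # Remaining bytes (control characters, space, etc.) are shifted to 256 and above.
    for b in range(256):
        if b not in printable:
            byte_values.append(b)
            code_points.append(256 + shift)
            shift += 1
    return {b: chr(c) for b, c in zip(byte_values, code_points)}


# Inverse table: stand-in character -> original byte value.
BYTE_DECODER = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Turn a byte-level vocabulary string back into readable text."""
    raw = bytearray()
    for ch in token:
        if ch in BYTE_DECODER:
            raw.append(BYTE_DECODER[ch])
        else:
            # Characters outside the table (e.g. a literal space that was
            # already decoded upstream) are passed through unchanged.
            raw.extend(ch.encode("utf-8"))
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãĤ¼ãĤ¦ãĤ¹", "ãĥ¼ãĥĨ", " destabil"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under that assumption, the two tokens decode to Japanese katakana fragments, while plain ASCII tokens such as `ized` are unchanged. If the model uses a different tokenizer, the mapping above does not apply.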
    Act Density 0.049%

    No Known Activations