    Explanations

    mentions of conservative ideologies and associated terms

    Negative Logits
    ings          -0.16
     мер          -0.15
    ollar         -0.15
    ONSE          -0.14
    itoris        -0.14
    нами          -0.14
    нам           -0.14
    pear          -0.13
    idar          -0.13
     PartialView  -0.13
    Positive Logits
    -leaning      0.19
    /lib          0.16
    aggio         0.16
    /social       0.16
    unker         0.14
    451           0.14
     friendly     0.14
     princ        0.14
    -social       0.14
    联            0.14
    Act Density 0.028%

    No Known Activations