INDEX
    Explanations

    sentences related to ideology and beliefs

    political language surrounding issues of manipulation and power dynamics

    Negative Logits
    DragonMagazine  -0.69
    iple            -0.65
     Originally     -0.61
     Deadline       -0.60
    raft            -0.59
    ©¶æ¥µ           -0.58
     Warehouse      -0.57
    ortium          -0.56
    availability    -0.56
     Medline        -0.55
    Positive Logits
     themselves     0.80
     subord         0.77
     ignor          0.77
     immoral        0.73
     unworthy       0.70
     inconvenient   0.70
     legitim        0.70
     bigotry        0.70
     ignorant       0.69
     undue          0.69
    Act Density 1.119%

    No Known Activations