INDEX
    Explanations

    references to political positions or roles, especially those related to opposition or alternative leadership

    references to "shadow" roles or positions in political contexts

    Negative Logits
    urses        -0.79
    ickr         -0.78
    ktop         -0.72
    keye         -0.72
    artney       -0.72
    renheit      -0.70
    anchester    -0.69
    aii          -0.69
    OPLE         -0.68
    TAIN         -0.68
    Positive Logits
    boxing       1.01
    moon         1.01
    loo          0.95
    runners      0.88
    fax          0.84
    flame        0.78
    fell         0.77
    shadow       0.76
    wra          0.76
    runner       0.76
    Activation Density: 0.044%

    No Known Activations