INDEX
    Explanations

    references to political figures and activities, particularly those relating to controversies and power dynamics

    Negative Logits
    Token                 Logit
    "awtextra"            -0.52
    "flox"                -0.51
    " BIBSYS"             -0.50
    " дописавши"          -0.50
    "ToScroll"            -0.49
    "Iné"                 -0.47
    "mender"              -0.46
    "utnik"               -0.46
    "Württemberg"         -0.46
    "principalColumn"     -0.45
    Positive Logits
    Token                 Logit
    " ph"                 1.28
    " Ph"                 1.28
    "Ph"                  1.16
    "ph"                  1.09
    " PH"                 1.02
    " PHILL"              0.95
    " PHO"                0.95
    " PHILLIPS"           0.94
    " phi"                0.92
    "PH"                  0.91
    Activation Density: 0.147%

    No Known Activations