    Explanations

    references to individuals involved in controversial political or social contexts

    Negative Logits
    Token             Logit
    "deniz"           -0.17
    " interact"       -0.17
    " interacting"    -0.16
    " vs"             -0.15
    " versus"         -0.15
    " allied"         -0.14
    " interacts"      -0.14
    " junto"          -0.14
    " Allied"         -0.14
    "ATALOG"          -0.14
    Positive Logits
    Token             Logit
    " whom"           0.30
    "871"             0.16
    "_tF"             0.15
    " whose"          0.15
    " mutual"         0.15
    " who"            0.15
    " quien"          0.14
    " tut"            0.14
    "uele"            0.14
    " recip"          0.14
    Activation Density: 0.364%
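
The density figure above can be read as a firing rate. The following is a minimal sketch, assuming that activation density means the fraction of sampled token positions on which the feature's activation is above zero; the page itself does not define the term. The function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" counts positions with activation > threshold;
    the dashboard does not state its exact definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical activations for one feature over 1,000 token positions:
# 996 positions where the feature is silent and 4 where it fires.
sample = [0.0] * 996 + [0.8, 1.2, 0.3, 2.1]

density = activation_density(sample)
print(f"{density:.3%}")
```

Under that reading, a density of 0.364% would correspond to the feature firing on roughly 36 of every 10,000 sampled tokens.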

    No Known Activations