INDEX
    Explanations

    references to specific individuals or groups in leadership or authoritative positions

    Negative Logits
    ingleton    -0.17
    vail        -0.15
    buch        -0.15
    importe     -0.15
    frei        -0.14
    _patch      -0.14
    ester       -0.14
    apiro       -0.14
    hari        -0.14
     Làm        -0.14
    Positive Logits
    yles         0.19
     Rob         0.15
    232          0.14
    anch         0.14
     Cub         0.14
    ibi          0.14
     Bru         0.14
    rium         0.14
    on           0.14
    boro         0.14
    Act Density 0.032%

    No Known Activations