INDEX
    Explanations

    references to people and societal interactions or behaviors

    Negative Logits
    Token              Logit
    " conte"           -0.47
    "cephala"          -0.45
    "éron"             -0.44
    "faz"              -0.44
    "two"              -0.44
    " conseillers"     -0.43
    "φορά"             -0.42
    "EPI"              -0.42
    "Unsigned"         -0.42
    "é"                -0.42

    Positive Logits
    Token              Logit
    "MLLoader"          0.85
    " ppl"              0.84
    " peoples"          0.83
    "ieteur"            0.82
    "InitVars"          0.81
    "roslav"            0.79
    "UserScript"        0.78
    "цездатний"         0.78
    " people"           0.76
    "таратура"          0.75

    Activation Density: 0.262%

    No Known Activations