INDEX
    Explanations

    mentions of government officials or political figures

    Negative Logits
    "ازی"            -0.16
    " Marathon"      -0.15
    "ERİ"            -0.14
    "ynchronous"     -0.14
    "ERY"            -0.14
    "rière"          -0.14
    "uft"            -0.14
    "dden"           -0.14
    "ards"           -0.14
    "ictor"          -0.14
    Positive Logits
    "iors"           0.30
    "egal"           0.30
    "eca"            0.28
    "pai"            0.23
    "iores"          0.22
    "ior"            0.22
    "ile"            0.21
    "IOR"            0.20
    "esch"           0.19
    "Sen"            0.19
    Activation Density: 0.009%

    No Known Activations