INDEX
    Explanations

    references to media outlets and their editorial stances

    Negative Logits
    èŃ              -0.18
    okud            -0.16
     ModelState     -0.15
    istrovství      -0.15
    arkin           -0.14
    /INFO           -0.14
    átor            -0.14
    /wiki           -0.14
     medium         -0.14
    erosis          -0.14
    Positive Logits
    anca                0.15
     lep                0.14
    enties              0.13
    _dispatch           0.13
    GenerationStrategy  0.13
    esis                0.13
    ascript             0.13
    aptops              0.13
    Panel               0.13
    afd                 0.13
    Act Density 0.380%

    No Known Activations