    Explanations

    references to legal or political issues

    Negative Logits
    Token         Logit
    " voks"       -0.16
    "iyan"        -0.15
    "cede"        -0.15
    " bekl"       -0.14
    " ragaz"      -0.14
    "altar"       -0.14
    " Wyn"        -0.14
    " Andrews"    -0.14
    " Pf"         -0.14
    " Kinder"     -0.14
    Positive Logits
    Token         Logit
    " og"         0.22
    " och"        0.19
    " på"         0.17
    "å"           0.17
    "aler"        0.17
    " nr"         0.17
    "igh"         0.15
    "ø"           0.15
    ".nr"         0.15
    "isen"        0.15
    Activation Density: 0.182%

    No Known Activations