INDEX
    Explanations

    Political elections/politicians

    Negative Logits
    Token          Logit
    " adultery"    -0.08
    "不到"         -0.07
    "ondo"         -0.07
    ".Callback"    -0.06
    " bad"         -0.06
    "ABLE"         -0.06
    "(case"        -0.06
    " nrw"         -0.06
    "tk"           -0.06
    " věd"         -0.06
    Positive Logits
    Token          Logit
    "ervative"     0.06
    "\↵"           0.06
    "915"          0.06
    " hacker"      0.06
    " Milano"      0.06
    "/model"       0.06
    " %↵"          0.06
    " });↵↵"       0.06
    "φων"          0.06
    "Res"          0.06
    Act Density 0.010%

    No Known Activations