INDEX
    Explanations

    Controversial/Negative topics

    Negative Logits
    " Jana"         -0.07
    "Nobody"        -0.07
    " činnosti"     -0.06
    " Joan"         -0.06
    "ização"        -0.06
    " contar"       -0.06
    " sektör"       -0.06
    " humanities"   -0.06
    " psyched"      -0.06
    " EVE"          -0.06
    Positive Logits
    " gc"           0.07
    " Invoice"      0.07
    " Modifier"     0.06
    "イル"          0.06
    "838"           0.06
    " lst"          0.06
    "(:,"           0.06
    " icmp"         0.06
    "erged"         0.06
    "emplate"       0.06
    Activation Density: 0.098%

    No Known Activations