    Explanations

    mentions of false accusations or wrongdoing

    Negative Logits
    "hens"                 -1.05
    " guiActiveUnfocused"  -0.98
    "hem"                  -0.81
    "rador"                -0.77
    "hetti"                -0.76
    "xual"                 -0.72
    "oké"                  -0.71
    "ajo"                  -0.70
    "lov"                  -0.69
    "asio"                 -0.69
    Positive Logits
    " positives"           1.10
    " guiActiveUn"         0.99
    " dich"                0.91
    " accuser"             0.88
    " guiIcon"             0.88
    " accusation"          0.86
    "ulent"                0.86
    " alarms"              0.83
    " assumptions"         0.77
    "ulence"               0.77
    Act Density 0.027%

    No Known Activations