INDEX
    Explanations

    phrases related to allegations and claims of wrongdoing

    Negative Logits
    " allegedly"    -0.19
    " reportedly"   -0.19
    " supposedly"   -0.15
    "ught"          -0.15
    " предпол"      -0.15
    "art"           -0.15
    " arguably"     -0.14
    "ialized"       -0.14
    "iker"          -0.14
    "plevel"        -0.14
    Positive Logits
    "/pro"      0.19
    "hood"      0.18
    "LY"        0.18
    "soon"      0.17
    "ance"      0.17
    " soon"     0.17
    " future"   0.17
    "ly"        0.17
    "lys"       0.16
    ";y"        0.16
    Act Density 0.080%

    No Known Activations