INDEX
    Explanations

    phrases related to negative actions or events

    negative phrases that imply criticism or conflict

    Negative Logits
    "ously"       -0.75
    "HL"          -0.64
    "edo"         -0.61
    "â̦â̦â̦â̦â̦â̦â̦â̦"    -0.60
    "/("          -0.58
    "oks"         -0.58
    "xes"         -0.57
    "entials"     -0.57
    ":["          -0.56
    ".):"         -0.56
    Positive Logits
    "_-"                  1.74
    "webkit"              0.89
    "=-=-=-=-=-=-=-=-"    0.83
    " ie"                 0.77
    " [|"                 0.76
    "/-"                  0.74
    "=-=-=-=-"            0.74
    "named"               0.71
    "cens"                0.69
    "enough"              0.69
    Act Density 0.066%

    No Known Activations