INDEX
    Explanations

    phrases related to policies and decision-making

    conjunctions, particularly the word "and" and its variations

    Negative Logits
    bats          -0.80
    =>            -0.79
    ONSORED       -0.75
    owe           -0.74
    tumblr        -0.73
    urious        -0.73
    hov           -0.72
    ":["          -0.71
    ugi           -0.71
     )]           -0.71
    Positive Logits
     nephew       0.61
     Centauri     0.61
     Lucifer      0.61
     grace        0.59
     bake         0.58
     vomiting     0.57
    ablishment    0.56
     Gaia         0.55
    cmp           0.55
     Gamble       0.54
    Act Density 0.162%

    No Known Activations