    Explanations

    mentions of shocking or noteworthy events, possibly involving crime or controversy

    Negative Logits
     hairc  -0.90
     despotism  -0.88
     tupperware  -0.85
     ecru  -0.81
     newArr  -0.81
     cushi  -0.80
     newList  -0.79
     newVal  -0.79
     swarovski  -0.79
     philosophic  -0.79
    Positive Logits
     autorytatywna  0.88
    <bos>  0.75
    expandindo  0.69
    はじめに  0.68
    mdash  0.62
    ongiorno  0.58
     During  0.57
    hammad  0.57
     When  0.56
     <>",  0.55
    Activation Density: 0.099%

    No Known Activations