INDEX
    Explanations

    positive qualities or high levels of significance in text

    descriptors of quality and speed

    Negative Logits
    " Maze"       -0.69
    "odder"       -0.59
    "adelphia"    -0.58
    " Nile"       -0.58
    " Ariel"      -0.58
    " Ki"         -0.58
    " Insight"    -0.57
    " Nath"       -0.57
    " Jude"       -0.57
    "EE"          -0.56
    Positive Logits
    "ractive"         0.79
    " (>"             0.76
    "tarian"          0.69
    "nered"           0.66
    "xual"            0.63
    " sexism"         0.62
    "auld"            0.62
    "uilt"            0.62
    " compliments"    0.60
    "ifiable"         0.60
    Activation Density: 0.321%

    No Known Activations