INDEX
    Explanations

    comma-separated items within a list

    NEGATIVE LOGITS
    Token          Logit
    "worldly"      -0.68
    "ESE"          -0.65
    "Different"    -0.62
    "¬¼"           -0.62
    " (>"          -0.61
    "~~~~"         -0.58
    "Same"         -0.57
    "STEM"         -0.57
    "âĶģ"          -0.57
    " ',"          -0.57
    POSITIVE LOGITS
    Token          Logit
    " joins"       1.14
    " believes"    1.14
    " testified"   1.11
    " has"         1.10
    " insists"     1.09
    " denies"      1.08
    " withdrew"    1.07
    " remembers"   1.07
    " admits"      1.06
    " says"        1.06
    Act Density: 0.179%

    No Known Activations