INDEX
    Explanations

    proper nouns related to different people and places

    words related to organizational roles or actions

    Negative Logits
        " itself"         -0.67
        " Xer"            -0.63
        "imgur"           -0.62
        "=/"              -0.59
        "una"             -0.56
        "itiz"            -0.55
        " relies"         -0.55
        " lasts"          -0.54
        " doesnt"         -0.54
        " fav"            -0.53

    Positive Logits
        " respectively"    1.76
        " respective"      1.26
        " apiece"          1.18
        " jointly"         1.10
        " collectively"    0.97
        " together"        0.96
        " Together"        0.87
        "together"         0.86
        " themselves"      0.84
        "selves"           0.83

    Act Density 1.089%

    No Known Activations