INDEX
    Explanations

    terms related to achievements or qualities of individuals

    terms related to influence, success, and authority in various contexts

    Negative Logits
    "olulu"         -0.74
    " oneself"      -0.74
    "orks"          -0.63
    " ourselves"    -0.63
    "$."            -0.62
    "amiya"         -0.60
    "common"        -0.60
    " Ga"           -0.59
    "nant"          -0.58
    "berra"         -0.58
    Positive Logits
    " differs"        0.95
    " consisted"      0.93
    " extends"        0.92
    " consists"       0.87
    " encompasses"    0.84
    " woes"           0.83
    " differed"       0.81
    " revolves"       0.80
    " shines"         0.78
    " coincided"      0.78
    Activation Density: 0.420%

    No Known Activations