    Negative Logits
    "̃"              -0.07
    " Conference"   -0.07
    " lidí"         -0.07
    " opened"       -0.07
    " пацієн"       -0.07
    "iên"           -0.07
    " assaulted"    -0.06
    "acion"         -0.06
    "ائف"           -0.06
    " Message"      -0.06
    Positive Logits
    " worth"        0.12
    " Worth"        0.10
    "worth"         0.08
    " wealth"       0.08
    " prestige"     0.07
    " Value"        0.07
    " WORD"         0.07
    "TH"            0.07
    "orth"          0.07
    "uffed"         0.07
    Activation Density: 0.014%

    No Known Activations