INDEX
    Explanations
    Negative Logits
    " زی"       -0.07
    "/cli"      -0.07
    "ujícím"    -0.06
    " Reeves"   -0.06
    "\tcli"     -0.06
    " cave"     -0.06
    " desi"     -0.06
    " waves"    -0.06
    " ва"       -0.06
    "\tconf"    -0.06
    Positive Logits
    " honor"       0.11
    " honour"      0.10
    " honored"     0.09
    " honors"      0.09
    " honorary"    0.08
    "Hon"          0.07
    " honoring"    0.07
    "olor"         0.07
    " Honor"       0.07
    " honorable"   0.07
    Activation Density: 0.010%

    No Known Activations