    Negative Logits
    Token               Logit
    "yles"              -0.10
    " preceding"        -0.10
    " Kash"             -0.10
    " predominantly"    -0.10
    " distracting"      -0.09
    " predominant"      -0.09
    "izzo"              -0.09
    "icha"              -0.09
    " restrained"       -0.08
    " Dixon"            -0.08
    Positive Logits
    Token               Logit
    "dom"               0.42
    "Domin"             0.40
    " дом"              0.39
    " dom"              0.36
    " Domin"            0.35
    " domin"            0.35
    " dominance"        0.31
    " Dom"              0.29
    " DOM"              0.29
    " dominate"         0.27
    Activation Density: 0.186%

    No Known Activations