    NEGATIVE LOGITS
     posture          -0.07
    _models           -0.07
     VP               -0.07
     coat             -0.07
     moves            -0.06
     qry              -0.06
     Feb              -0.06
    share             -0.06
     Fern             -0.06
     ventures         -0.06
    POSITIVE LOGITS
     giden             0.07
     simil             0.06
    лина               0.06
    ,t                 0.06
    prehensive         0.06
     Peach             0.06
    ported             0.06
    ●●●●               0.06
    .setToolTipText    0.06
     внимание          0.06
    Activation Density: 0.009%

    No Known Activations