    Negative Logits
    " Bubble"         -0.08
    " Salary"         -0.08
    "Bubble"          -0.07
    "_sleep"          -0.07
    "Instrument"      -0.07
    "Salary"          -0.07
    " Musical"        -0.07
    " instrument"     -0.07
    "Seal"            -0.07
    " bullying"       -0.07
    Positive Logits
    " используется"   0.08
    " tells"          0.08
    " särsk"          0.08
    " allows"         0.08
    " dispõe"         0.07
    "’all"            0.07
    " dadas"          0.07
    " іст"            0.07
    " mada"           0.07
    " ada"            0.07
    Activation Density: 0.570%

    No Known Activations