INDEX
    Explanations
    NEGATIVE LOGITS
    " державної"      -0.07
    " linear"         -0.07
    "={['"            -0.07
                      -0.07
    " grandfather"    -0.06
    " position"       -0.06
    " λ"              -0.06
    "ां"              -0.06
    " standings"      -0.06
    " useClass"       -0.06
    POSITIVE LOGITS
    "-game"           0.06
    "-outline"        0.06
    "mates"           0.06
    " Gore"           0.06
    "_hook"           0.06
    "zahl"            0.06
    " atrib"          0.06
    " tal"            0.06
    "ARRANT"          0.06
    " vigorously"     0.06
    Activation Density: 0.013%

    No Known Activations