INDEX
    Explanations
    NEGATIVE LOGITS
    " onset"      -0.07
    "_CONTENT"    -0.07
    "razy"        -0.07
    " EDGE"       -0.07
    " числі"      -0.06
    " Profes"     -0.06
    "}'."         -0.06
    " Moreno"     -0.06
    " drink"      -0.06
    "Roman"       -0.06
    POSITIVE LOGITS
    "[label"      0.07
    "izable"      0.06
    " +-"         0.06
    "彩票"         0.06
    "енную"       0.06
    "702"         0.06
    "74"          0.06
    " Seeder"     0.06
    "ắc"          0.06
    "ivation"     0.06
    Activation Density: 0.000%

    No Known Activations