    Negative Logits
     interact
    -0.07
     main
    -0.07
     occupancy
    -0.07
     Vertical
    -0.07
     euros
    -0.07
     aura
    -0.06
     folds
    -0.06
    Bone
    -0.06
    -0.06
     площад
    -0.06
    Positive Logits
    :+
    0.07
    426
    0.07
    .')↵
    0.07
    !!)↵
    0.06
    STYPE
    0.06
    .")↵
    0.06
     appropriately
    0.06
     …↵
    0.06
    лишком
    0.06
     свого
    0.06
    Act Density 0.062%

    No Known Activations