    NEGATIVE LOGITS
    Token         Logit
    (Config       -0.06
    _gap          -0.06
     Manitoba     -0.06
     ------------------------------------------------------------------------↵    -0.06
     lobbyist     -0.06
    :@"           -0.06
    DataRow       -0.06
    wap           -0.06
    Expansion     -0.05
     чит          -0.05
    POSITIVE LOGITS
    Token         Logit
    -popup        0.07
    .ReLU         0.07
    сторія        0.07
    roveň         0.07
     oh           0.06
    ğiniz         0.06
     rall         0.06
     После        0.06
     Fer          0.06
     школи        0.06
    Act Density 0.089%

    No Known Activations