INDEX
    Explanations
    NEGATIVE LOGITS
    (Bundle       -0.08
    hle           -0.08
    γο            -0.07
     Bw           -0.07
    zo            -0.07
     traj         -0.07
    (theta        -0.07
    理念          -0.07
    реж           -0.07
     sjuk         -0.07
    POSITIVE LOGITS
     سن           0.08
    Neighbor      0.08
    COMMENTS      0.08
     constituent  0.08
     vís          0.08
    SET           0.08
    .comments     0.07
    Ranking       0.07
     Dig          0.07
    .comment      0.07
    Act Density 0.004%

    No Known Activations