INDEX
    Explanations

    sports teams and collectibles

    Negative Logits
    Token               Logit
    "TokenNameEQUAL"    -0.86
    " humanos"          -0.85
    " INSPIRE"          -0.74
    " screenshots"      -0.69
    "ンダル"            -0.69
    "Согласно"          -0.69
    "jectures"          -0.68
    "埃及"              -0.68
    " pata"             -0.67
    " malfunction"      -0.66
    Positive Logits
    Token               Logit
    " cards"            1.01
    " card"             0.96
    " Pose"             0.85
    " backs"            0.84
    " tobacco"          0.81
    " poses"            0.81
    "Pose"              0.80
    "expressions"       0.80
    "HORIZONTAL"        0.80
    " borders"          0.79
    Act Density 0.006%

    No Known Activations