INDEX
    Explanations

    color, pink, red

    Negative Logits
    Token              Logit
    " rotate"          -0.07
    (blank)            -0.07
    " Ix"              -0.07
    "_rotate"          -0.07
    " vele"            -0.07
    " показ"           -0.07
    " Permission"      -0.07
    " Diamond"         -0.07
    " written"         -0.07
    " caused"          -0.07
    Positive Logits
    Token              Logit
    " lob"             0.09
    " acknowledging"   0.08
    "Acknowled"        0.08
    "Laur"             0.08
    " vendedores"      0.08
    " assistência"     0.07
    " Congressional"   0.07
    "ereka"            0.07
    "辅助"             0.07
    "uthuk"            0.07
    Activation Density: 0.009%

    No Known Activations