    Explanations

    news articles

    Negative Logits
     sabotage       -0.07
    ymbol           -0.06
    POINT           -0.06
     Vehicle        -0.06
    simp            -0.06
     Object         -0.06
     WT             -0.06
    ред             -0.06
     ideological    -0.06
     ASC            -0.06
    Positive Logits
     songwriter     0.07
    GraphQL         0.06
                    0.06
     zeit           0.06
                    0.06
     deutsch        0.06
                    0.06
     Portrait       0.06
    оя              0.06
     Bulg           0.06
    Act Density 0.029%

    No Known Activations