INDEX
    Explanations

    color-related elements or attributes in the text

    Negative Logits
    Token       Logit
    "itters"    -0.08
    "lef"       -0.08
    "zon"       -0.07
    "adena"     -0.07
    "ERSHEY"    -0.06
    "ISCO"      -0.06
    "weit"      -0.06
    " зави"     -0.06
    "abez"      -0.06
    "udden"     -0.06
    Positive Logits
    Token       Logit
    "edio"      0.07
    "iva"       0.06
    "くん"      0.06
    " kicker"   0.06
    "ary"       0.06
    " Mag"      0.06
    "red"       0.06
    " gold"     0.06
    "색"        0.06
    " recon"    0.06
    Activation Density: 0.004%

    No Known Activations