    NEGATIVE LOGITS
    "ertia"               -0.07
    "لت"                  -0.07
    " влади"              -0.06
    (no visible token)    -0.06
    (no visible token)    -0.06
    "cessive"             -0.06
    "prt"                 -0.06
    (no visible token)    -0.06
    "_i"                  -0.06
    " vlan"               -0.06
    POSITIVE LOGITS
    " the"     0.10
    "—the"     0.10
    " The"     0.09
    "The"      0.09
    ",the"     0.08
    "the"      0.08
    " THE"     0.08
    ".The"     0.07
    "-the"     0.07
    ".the"     0.07
    Act Density 0.051%

    No Known Activations