INDEX
    Explanations

    instances of passive voice or reports of past actions

    Negative Logits
    ariat
    -0.06
    inspace
    -0.06
    管
    -0.06
     mau
    -0.06
    nees
    -0.06
    ari
    -0.06
     ing
    -0.06
    .synthetic
    -0.05
    ire
    -0.05
    640
    -0.05
    Positive Logits
    cela
    0.09
    ãĥ¼ãĥĭ
    0.07
    ãģ£ãģı
    0.07
    'gc
    0.07
    اباÙĨ
    0.06
     خاص
    0.06
    ozem
    0.06
    unsch
    0.06
    มาย
    0.06
    ì§Ī
    0.06
    Activation Density: 0.034%

    No Known Activations