INDEX
    Explanations

    agent, sentiment, ventricle

    Negative Logits
    Token        Logit
    [missing]    1.52
    [missing]    1.51
    [missing]    1.42
    "েন"         1.38
    [missing]    1.34
    " can"       1.33
    "ेल"         1.28
    "ا"          1.28
    "ки"         1.25
    "ター"       1.24
    Positive Logits
    Token        Logit
    " be"        1.16
    "%"          1.13
    "$)"         1.11
    "،"          1.09
    " for"       1.09
    "ine"        1.07
    "$:"         1.07
    " have"      1.06
    ">"          1.06
    "}^"         1.05
    Activation Density: 0.522%

    No Known Activations