INDEX
    Explanations

    expressions of contrasting ideas or conditions

    NEGATIVE LOGITS
    Token                     Logit
    " Ost"                    -0.15
    "lesi"                    -0.14
    " Wass"                   -0.13
    "InterfaceOrientation"    -0.13
    "å¹³"                     -0.13
    "Ł"                       -0.13
    " verte"                  -0.13
    "ÑĢоÑī"                   -0.13
    "æīĭãĤĴ"                  -0.13
    " Mut"                    -0.13
    POSITIVE LOGITS
    Token                     Logit
    "ters"                    0.18
    "è¿ĺæĺ¯"                  0.17
    " Still"                  0.17
    " still"                  0.17
    "Still"                   0.16
    " nevertheless"           0.16
    " ìŬ"                     0.15
    "izzo"                    0.15
    " Nevertheless"           0.14
    "still"                   0.14
    Activation Density 0.184%

    No Known Activations