INDEX
    Explanations

    sentences that express a strong opinion or judgment

    Negative Logits
    Token          Logit
    "Trait"        -0.07
    "erah"         -0.07
    " downside"    -0.07
    " albeit"      -0.07
    "ansi"         -0.07
    "arro"         -0.07
    "raya"         -0.06
    "□"            -0.06
    " although"    -0.06
    "azes"         -0.06
    Positive Logits
    Token          Logit
    " nor"         0.11
    " but"         0.11
    " But"         0.09
    " Nor"         0.09
    "but"          0.08
    "，但"          0.08
    " maar"        0.08
    "But"          0.07
    " но"          0.07
    " 하지만"        0.07
    Act Density 0.028%

    No Known Activations