INDEX
    Explanations

    contrasting conjunctions or statements of consequence

    Negative Logits
     vastly  0.73
     extremely  0.66
     European  0.63
    非常に  0.62
     tổng  0.61
     tốt  0.60
    সামরিক  0.60
     prejudices  0.60
     life  0.59
     various  0.59
    Positive Logits
     implying  0.77
     אך  0.77
     Implications  0.71
     Conversely  0.65
    但在  0.63
    했지만  0.62
     suggesting  0.62
    是因為  0.62
     ولكن  0.61
     immunore  0.61
    Activation Density: 0.029%

    No Known Activations