    Explanations

    conjunctions and contrast

    Negative Logits
    " также"        1.02
    "overall"       1.01
    " نیز"          1.00
    " ise"          0.95
    " également"    0.93
    " أيضًا"         0.91
    "较为"           0.91
    " inoltre"      0.91
    "示例"           0.90
    "additional"    0.89
    Positive Logits
    " And"          2.18
    " But"          1.82
    "And"           1.57
    " That"         1.44
    " Or"           1.40
                    1.38
    " Not"          1.37
    " Like"         1.33
    " Nhưng"        1.28
    " Maybe"        1.27
    Activation Density: 0.571%

    No Known Activations