    Explanations

    manipulation and conflict resolution

    Negative Logits
    " extravaganza"  1.26
    " отличие"       1.08
    " Although"      1.07
    "Although"       1.06
    " Yani"          1.04
    " 虽然"          1.03
    " foremost"      1.02
    " yaitu"         1.01
    " although"      1.01
    " offenbar"      1.00
    Positive Logits
    " någon"         0.99
    " einer"         0.93
    " zusätzliche"   0.86
    " einem"         0.85
    "𝚊"             0.83
    "其它"           0.81
    " інших"         0.80
    "ോട്ടോ"         0.77
    " bárm"          0.76
    "zeitig"         0.74
    Activation Density: 1.666%

    No Known Activations