    Negative Logits
     but  0.67
     But  0.58
     ولكن  0.57
     BUT  0.55
     αλλά  0.55
    (blank)  0.54
     لیکن  0.54
     אבל  0.53
     लेकिन  0.52
     لكن  0.51
    Positive Logits
    ;  1.47
    (blank)  1.35
    *;  1.29
    ؛  1.25
    $;  1.24
    }$;  1.20
    ();  1.16
    -;  1.14
    °;  1.11
    _;  1.09
    Activation Density: 0.208%

    No Known Activations