INDEX
    Explanations

    attends from subsequent tokens that offer additional or complementary context to the specific tokens marking parallel concepts or categories

    Head Attr Weights
    0:0.34
    1:0.21
    2:0.11
    3:0.09
    4:0.05
    5:0.03
    6:0.06
    7:0.08
    Negative Logits
    böz  -0.61
    satunya  -0.59
    <h6>  -0.57
    FFIX  -0.57
     Davido  -0.55
     للمعارف  -0.55
     Mahat  -0.54
    guten  -0.53
    atchewan  -0.53
    ونه  -0.53
    Positive Logits
     onOptions  0.59
    اریخ  0.55
    তথ্যসূত্র  0.54
    TagHelper  0.53
    ?”  0.53
     translateY  0.52
    camore  0.51
     فريبيس  0.50
    plak  0.49
     Nestor  0.48
    Act Density 0.328%

    No Known Activations