    Explanations

    multi-head attention queries

    NEGATIVE LOGITS
    "лт"             0.44
    " लालू"          0.42
    "ccnc"           0.42
    " vutta"         0.42
    (not rendered)   0.42
    "captcha"        0.41
    " कप"            0.41
    "endeu"          0.40
    "нению"          0.40
    " постепен"      0.40
    POSITIVE LOGITS
    " scaled"    0.61
    " Queries"   0.61
    " queries"   0.57
    "Queries"    0.55
    " query"     0.54
    "Scaled"     0.53
    "queries"    0.53
    "Query"      0.52
    " Query"     0.51
    " Scal"      0.49
    Activation Density: 0.020%

    No Known Activations