INDEX
    Explanations

    building efficient transformers

    Negative Logits
    " persuaded"     0.85
    " ጥላ"            0.82
    " convincingly"  0.80
    " indicated"     0.79
    " depicted"      0.79
    " liable"        0.79
    "串口"           0.78
    " bilirubin"     0.78
    " dissipated"    0.77
    " carboxyl"      0.77
    Positive Logits
    "iendo"  0.93
    "maq"    0.83
    "saf"    0.82
    "s"      0.82
    "safe"   0.82
    "sch"    0.81
    "sz"     0.81
    "ن"      0.81
    "sar"    0.80
    "savvy"  0.79
    Activation Density: 0.002%

    No Known Activations