    Explanations

    references to bias and its variations in context

    Negative Logits
    Token      Logit
    伴         -0.16
    (.)        -0.15
    bine       -0.14
    agar       -0.14
    bio        -0.14
    ῦ          -0.14
    lw         -0.13
     flesh     -0.13
     NW        -0.13
    방         -0.13
    Positive Logits
    Token      Logit
    hetto      0.17
    rif        0.16
    ogg        0.15
    рд         0.15
    acz        0.15
    emouth     0.15
    成人       0.15
    forme      0.15
    odash      0.15
    .desktop   0.14
    Activation Density: 0.015%

    No Known Activations