INDEX
    Explanations

    binding phrases that emphasize connection and relational understanding

    Negative Logits
    usting      -0.17
    ndern       -0.15
     PIL        -0.15
    ij          -0.15
     Pil        -0.15
    ợi          -0.15
    .AF         -0.14
    ftware      -0.14
    ะ           -0.14
    icc         -0.13

    Positive Logits
    esel        0.15
     dr         0.14
     Offensive  0.14
     diff       0.14
    _PATCH      0.14
     Ens        0.14
    FILTER      0.14
     Fab        0.14
     ens        0.13
    ToLocal     0.13

    Act Density 0.013%

    No Known Activations