INDEX
    Explanations

    singular unit

    NEGATIVE LOGITS
    ".yang"          -0.07
    "wine"           -0.06
    "286"            -0.06
    ".new"           -0.06
    "Width"          -0.06
    " REL"           -0.06
    "ΑΚ"             -0.06
    "oui"            -0.06
    ".observable"    -0.06
    "-food"          -0.06
    POSITIVE LOGITS
    " coc"           0.07
    "panels"         0.06
    "When"           0.06
    " retard"        0.06
    "structured"     0.06
    " sinon"         0.06
    "(...)"          0.06
    "_ped"           0.06
    "ديث"            0.06
    " `↵"            0.06
    Activation Density: 0.061%

    No Known Activations