INDEX
    Explanations

    articles/prepositions

    Negative Logits
    ILD             -0.07
    dy              -0.07
    Button          -0.06
     //@            -0.06
     biodiversity   -0.06
    Pl              -0.06
    tribution       -0.06
     physics        -0.06
    aptic           -0.06
     token          -0.06
    Positive Logits
     ευ             0.07
    .')↵↵           0.06
     ----------------------------------------------------------------  0.06
    希望              0.06
    .shiro          0.06
    "?↵↵            0.06
     &↵             0.06
    &↵              0.06
    assembly        0.06
     війни          0.06
    Act Density 0.011%

    No Known Activations