    NEGATIVE LOGITS
    Token           Logit
    "ippo"          -0.14
    "rito"          -0.14
    "avenport"      -0.14
    "utherford"     -0.14
    " Writers"      -0.14
    "ulet"          -0.14
    "ño"            -0.13
    " Hitch"        -0.13
    " meant"        -0.13
    "ichi"          -0.13
    POSITIVE LOGITS
    Token           Logit
    "://"           0.18
    "flip"          0.15
    "atego"         0.15
    " }};↵"         0.15
    ".criteria"     0.15
    "odzi"          0.14
    " Hod"          0.14
    "alim"          0.14
    "inati"         0.14
    "Assignable"    0.14
    Activation Density: 0.024%

    No Known Activations