INDEX
    Explanations
    Negative Logits
    Token     Logit
    "lt"      -0.26
    "jt"      -0.25
    "gt"      -0.24
    "dt"      -0.23
    "kt"      -0.22
    "Dt"      -0.22
    " Dt"     -0.22
    "zt"      -0.22
    "Lt"      -0.22
    "rt"      -0.22
    Positive Logits
    Token        Logit
    " toasted"   0.19
    "anto"       0.19
    "ANTA"       0.17
    " Tony"      0.16
    "ANTI"       0.16
    "rides"      0.16
    "ento"       0.16
    "onto"       0.16
    "into"       0.16
    "vester"     0.16
    Activation Density: 0.041%

    No Known Activations