    Negative Logits
    Token          Logit
    " tape"        -1.77
    " Tape"        -1.60
    " TAPE"        -1.41
    "Tape"         -1.27
    " tapes"       -1.20
    "tape"         -1.19
    " taped"       -1.15
    " taping"      -0.98
    "テープ"       -0.79
    " gloves"      -0.77
    Positive Logits
    Token          Logit
    "ever"         0.71
    "e"            0.69
    "eed"          0.66
    "y"            0.63
    " EconPapers"  0.57
    "work"         0.56
    "writer"       0.56
    "hand"         0.56
    "front"        0.54
    "ee"           0.54
    Activation Density: 1.464%

    No Known Activations