INDEX
    Explanations
    Negative Logits
     leſs       -0.77
     ſtate      -0.68
     ſche       -0.66
     laſt       -0.65
     leaſt      -0.64
     ſy         -0.64
     pleaſure   -0.64
    leſs        -0.64
     ſeveral    -0.63
     ſa         -0.61
    Positive Logits
    com         1.05
    Com         0.77
     Com        0.72
    COM         0.71
     com        0.69
     COM        0.65
    coms        0.61
    コム        0.61
    m           0.59
     Comstock   0.57
    Activation Density: 0.081%

    No Known Activations