INDEX
    Explanations

    punctuation

    Negative Logits
     d           -0.53
     miss        -0.52
     po          -0.52
    zuführen     -0.52
     a           -0.51
     n           -0.49
     m           -0.49
     to          -0.48
     me          -0.48
     i           -0.47
    Positive Logits
     itſelf       0.88
     auroit       0.82
     feroit       0.78
     avoient      0.77
    ſelf          0.75
     raiſ         0.75
     Reſ          0.74
     pouvoit      0.74
     ་་           0.74
     auffi        0.73
    Act Density 0.024%

    No Known Activations