    Negative Logits
     Harness    -0.48
     clef       -0.46
     Slay       -0.45
     Receipt    -0.45
    czyna       -0.44
     alve       -0.43
    Footnote    -0.41
     réguli     -0.41
     Beet       -0.41
    tenis       -0.41
    Positive Logits
    <bos>       3.28
    __':\n      0.90
    __":\n      0.77
    /**         0.70
    ']>         0.70
    '           0.68
     })}        0.66
    '}>         0.65
    #           0.64
    ')):        0.64
    Activation Density: 0.640%

    No Known Activations