    Explanations

    Attends to numeric value tokens enclosed in double square brackets ([[...]]) from numeric value tokens enclosed in double asterisks (**...**).

    Head Attr Weights
    0:0.19
    1:0.18
    2:0.19
    3:0.08
    4:0.08
    5:0.09
    6:0.06
    7:0.09
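
The head attribution weights above can be summarized with a few lines of Python. This is a minimal sketch, not part of the original dashboard: the variable names are illustrative, and the values are copied by hand from the listing. The displayed weights sum to about 0.96 rather than 1; the source does not say whether the shortfall is due to rounding, so the shares below are computed relative to the displayed total.

```python
# Head attribution weights copied from the listing above (head index -> weight).
head_weights = {0: 0.19, 1: 0.18, 2: 0.19, 3: 0.08, 4: 0.08, 5: 0.09, 6: 0.06, 7: 0.09}

# Total of the displayed weights (about 0.96, not exactly 1).
total = sum(head_weights.values())

# Rank heads from largest to smallest weight; equal weights keep index order.
ranked = sorted(head_weights.items(), key=lambda kv: kv[1], reverse=True)

# Share of the displayed total carried by the three largest heads.
top3 = ranked[:3]
top3_share = sum(w for _, w in top3) / total

print(f"total weight: {total:.2f}")
for head, weight in ranked:
    print(f"head {head}: {weight:.2f} ({weight / total:.1%} of displayed total)")
print(f"top 3 heads {sorted(h for h, _ in top3)} carry {top3_share:.1%}")
```

Heads 0, 1, and 2 carry noticeably more weight than heads 3 through 7, which is the main pattern the listing shows.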
    Negative Logits
     Efq
    -0.59
     itſelf
    -0.55
     myſelf
    -0.53
     Monfieur
    -0.52
    ſelves
    -0.51
     nahilalakip
    -0.50
    __':
    
    -0.50
     Jefus
    -0.49
     pleaſure
    -0.49
     themſelves
    -0.49
    Positive Logits
     l
    0.30
    ,
    0.27
    0.26
     L
    0.25
    0.24
     "
    0.24
     K
    0.23
     del
    0.23
     di
    0.22
     dell
    0.22
    Act Density 0.249%

    No Known Activations