INDEX
    Negative Logits
    Token        Logit
     Efq         -1.20
     Jefus       -1.16
     itſelf      -1.12
     Wicidata    -1.08
    %")          -1.06
    bibfield     -1.05
     purpoſe     -1.05
    __':         -1.04
    bibinfo      -1.03
     myſelf      -1.03
    Positive Logits
    Token        Logit
    ↵↵            0.70
    .             0.69
    ,             0.68
                  0.66
                  0.64
     in           0.59
    B             0.57
    \             0.56
    "             0.56
                  0.55
    Activation Density: 0.112%

    No Known Activations