INDEX
    Explanations

    instances of multi-token sequences or phrases

    Negative Logits
    " lead"     -0.35
    " torch"    -0.33
    " kain"     -0.32
    " Lead"     -0.31
    " man"      -0.31
    " sper"     -0.30
    "тро"       -0.30
    "tr"        -0.30
    " trans"    -0.29
    " sab"      -0.29
    Positive Logits
    "tagHelperRunner"     0.82
    "fjspx"               0.72
    " NSCoder"            0.72
    " &___"               0.69
    "parsedMessage"       0.68
    " يتيمه"              0.62
    " fieldNum"           0.61
    " للاسماء"            0.61
    "ArgsConstructor"     0.60
    "setVerticalGroup"    0.59
    Activation Density: 0.003%

    No Known Activations