    Explanations

    the words "kind" or "sort."

    Negative Logits
     itſelf             -1.30
    ſelf                -1.28
     myſelf             -1.26
    InjectAttribute     -1.25
     تانيه              -1.25
     שוליים             -1.23
    ArgsConstructor     -1.22
    ſelves              -1.20
     ―――――              -1.19
     ſtate              -1.17
    Positive Logits
    0.90
    <eos>
    0.82
    1
    0.80
    ↵↵
    0.79
    0.77
    .
    0.76
    4
    0.72
     .
    0.71
     A
    0.71
      
    0.68
    Activation Density: 0.548%

    No Known Activations