INDEX
    Explanations

    references to characters and titles of nobility or authority

    Negative Logits
    Token            Logit
    " متعلقه"        -1.24
    " itſelf"        -1.20
    " الرياضيه"      -1.17
    " himſelf"       -1.17
    " themſelves"    -1.14
    "ſelves"         -1.14
    " صوتيه"         -1.14
    " myſelf"        -1.08
    " Theſe"         -1.08
    "ValueStyle"     -1.07
    Positive Logits
    Token            Logit
    ","              0.59
    (not shown)      0.50
    " did"           0.47
    " k"             0.45
    "<eos>"          0.44
    "_"              0.43
    " skar"          0.43
    (not shown)      0.43
    " it"            0.43
    "</b>"           0.43
    Activation Density: 0.100%

    No Known Activations