    NEGATIVE LOGITS
    yse    1.12
    ل    1.04
    𝗧    1.02
    𝗱    0.98
    𝘀    0.97
    БО    0.95
    ه    0.94
        0.94
    ТО    0.93
    𝗹    0.92
    POSITIVE LOGITS
     structs    1.00
    8    0.99
     lenders    0.93
     fonts    0.93
    om    0.92
     offences    0.91
    ar    0.89
     edges    0.89
     precautions    0.89
    7    0.89
    Activation Density: 0.204%

    No Known Activations