INDEX
    Explanations

    URLs and web-related resources

    Negative Logits
    Token            Logit
    "AndEndTag"      -1.02
    " iſt"           -1.01
    " ―――――"         -0.96
    "^(@)"           -0.94
    " ་་"            -0.92
    " myſelf"        -0.86
    " dieß"          -0.86
    " itſelf"        -0.85
    " themſelves"    -0.84
    " auffi"         -0.83
    Positive Logits
    Token            Logit
    "w"              3.11
    " w"             2.83
    "W"              2.26
    " W"             2.23
    "𝙬"              1.25
    "𝐰"              1.19
    "𝘄"              1.18
    "𝒘"              1.11
                     1.10
    "𝑤"              1.09
    Act Density 0.133%

    No Known Activations