Explanations
URLs and web-related resources
Negative Logits
Token         Logit
AndEndTag     -1.02
iſt           -1.01
―――――         -0.96
^(@)          -0.94
་་            -0.92
myſelf        -0.86
dieß          -0.86
itſelf        -0.85
themſelves    -0.84
auffi         -0.83
Positive Logits
Token         Logit
w             3.11
w             2.83
W             2.26
W             2.23
𝙬             1.25
𝐰             1.19
𝘄             1.18
𝒘             1.11
ᴡ             1.10
𝑤             1.09
Activations Density: 0.133%
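A density figure like this one is commonly read as the share of token positions on which the feature fires. The sketch below illustrates that reading under stated assumptions: the function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and are not taken from this page or from any particular tool.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature is
    active (activation strictly greater than the threshold).
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical toy data: 3 of 8 token positions are active.
sample = [0.0, 0.0, 1.7, 0.0, 0.4, 0.0, 0.0, 2.1]
density = activation_density(sample)
print(f"{density:.3%}")  # prints "37.500%"
```

Under this reading, a density of 0.133% would correspond to the feature firing on roughly 1 in every 750 tokens of the sampled text.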