INDEX
Explanations
numeric identifiers or values in a structured format
Tokens after a capitalized word
Fort, Marcus, Share, Project, Marie
Negative Logits
SharedDtor      -0.94
✨:              -0.85
HasAnnotation   -0.83
aarrggbb        -0.81
utafitiHapana   -0.80
GenerationType  -0.80
yntaxException  -0.80
dflare          -0.79
חיצוניים  -0.79
ویکیپدیای  -0.79
Positive Logits
0.80
4  0.77
2  0.74
0  0.74
5  0.72
1  0.72
3  0.71
7  0.70
9  0.69
8  0.69
Activations Density 1.135%