INDEX
Explanations
occurrences of programming or code-related structures
Negative Logits
Token        Logit
classNames   -0.77
grá          -0.77
mlen         -0.74
庄           -0.74
𝐞            -0.73
𝐥            -0.72
𝐮            -0.70
ActionBar    -0.70
Thra         -0.70
n            -0.69
Positive Logits
Token        Logit
}}$,         1.27
\}$,         1.26
])));        1.21
)}$,         1.18
}}"></       1.17
}}$,         1.14
})));        1.14
}}$.         1.13
]$.          1.13
})$,         1.12
Activations Density: 0.204%