Explanations
syntax-related elements in programming code
Negative Logits
ฝ          -0.67
udad       -0.66
igold      -0.63
oot        -0.63
gger       -0.63
Wib        -0.63
capital    -0.62
rams       -0.60
archical   -0.60
peg        -0.60
Positive Logits
__":       1.93
__':       1.91
__':       1.87
__":       1.86
"])){      1.26
}))        1.24
'])){      1.16
))){       1.12
الحره      1.11
}))        1.08
Activations Density 0.023%