Explanations
syntactical structures and elements from programming code
Negative Logits
II              -0.71
IV              -0.64
III             -0.63
Id              -0.53
Il              -0.51
Winaray         -0.50
principalTable  -0.50
brainly         -0.50
Ing             -0.46
ее              -0.45
Positive Logits
i               2.53
i               1.64
𝑖               0.85
i               0.82
iT              0.79
iK              0.78
僕は            0.77
iM              0.76
iwa             0.76
iL              0.74
Activations Density: 0.410%