Explanations
references to built-in components or features in technology or software
Negative Logits
مشين    -0.71
ValueStyle    -0.66
twimg    -0.65
permitAll    -0.56
WriteTagHelper    -0.56
שוליים    -0.55
متعلقه    -0.54
elemField    -0.52
بوابة    -0.51
AndEndTag    -0.49
Positive Logits
builtin    0.95
intrinsic    0.92
intrinsic    0.89
endogenous    0.88
Intrinsic    0.81
内置    0.80
builtin    0.78
inherent    0.77
native    0.77
Intrinsic    0.76
Activations Density: 0.400%