INDEX
Explanations
lines or sections of code that appear to follow a specific comment or formatting pattern
Negative Logits
Token               Logit
================    -0.75
abetes              -0.69
b                   -0.67
                    -0.62
r                   -0.61
e                   -0.60
as                  -0.60
er                  -0.59
be                  -0.59
is                  -0.58
Positive Logits
Token               Logit
################    1.45
##                  1.42
////////////////    1.36
']))                1.21
]")                 1.20
])):                1.20
#########           1.18
########            1.17
##############      1.17
###########         1.16
Activations Density 0.090%