INDEX
Explanations
phrases related to boundaries being crossed
references to crossing boundaries or lines, both literal and metaphorical
Negative Logits
thood      -0.95
LESS       -0.74
FILE       -0.71
FORE       -0.69
DAY        -0.67
summary    -0.66
ummies     -0.64
alike      -0.64
manship    -0.63
larg       -0.63
Positive Logits
threshold    1.35
boundary     1.03
boundaries   1.02
thresholds   1.02
wire         0.91
border       0.90
hurdle       0.88
bounds       0.87
fence        0.87
limit        0.86
Activations Density: 0.102%