Explanations
references to the word "gate"
references to gates or barriers, indicating a focus on physical or metaphorical entry points
Negative Logits
subp          -0.73
constitu      -0.66
own           -0.65
skilled       -0.65
specialized   -0.64
blueprint     -0.62
inances       -0.61
>>>>>>>>      -0.58
marrow        -0.57
ynt           -0.57
Positive Logits
gate          1.49
Gate          1.11
way           0.93
boro          0.87
ardless       0.86
pole          0.85
ways          0.83
cloth         0.83
math          0.82
watch         0.82
Activations Density: 0.008%