INDEX
Explanations
dimension
The neuron detects mentions of the word “dimension”, including derived forms such as “dimensionality” and “dimensionally”.
Negative Logits
Token     Logit
Ou        -0.07
boat      -0.06
So        -0.06
_combo    -0.06
可        -0.06
Poe       -0.06
698       -0.06
Ho        -0.06
ya        -0.06
929       -0.06
Positive Logits
Token         Logit
dimension     0.14
Dimension     0.13
dimensions    0.13
dimension     0.12
Dimensions    0.11
Dimension     0.11
_dimension    0.10
dimensions    0.09
Dimensions    0.09
imension      0.09
Activation Density: 0.013%
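
As a quick sanity check, the tokens listed under the positive logits can be compared against the explanation. The sketch below is illustrative only: the token/value pairs are copied from the listing above, while the helper name `matches_explanation` and the stem "imension" (chosen so that the fragment token also counts) are assumptions made for this example, not part of the dashboard.

```python
# Positive-logit tokens and values, copied from the listing above.
positive_logits = [
    ("dimension", 0.14), ("Dimension", 0.13), ("dimensions", 0.13),
    ("dimension", 0.12), ("Dimensions", 0.11), ("Dimension", 0.11),
    ("_dimension", 0.10), ("dimensions", 0.09), ("Dimensions", 0.09),
    ("imension", 0.09),
]


def matches_explanation(token: str, stem: str = "imension") -> bool:
    """Hypothetical helper: True if the token contains the stem, ignoring case."""
    return stem in token.lower()


# Tokens that agree with the "dimension" explanation.
matching = [tok for tok, _ in positive_logits if matches_explanation(tok)]
fraction = len(matching) / len(positive_logits)

print(f"{len(matching)} of {len(positive_logits)} tokens contain the stem")
```

The same helper can be run over the negative-logit tokens to see whether any of them relate to the explanation.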