Explanations
The neuron activates on the word “metric” or “metrics,” identifying mentions of that term.
Negative Logits
-unit       -0.07
"a          -0.07
ashamed     -0.07
Noah        -0.07
aw          -0.06
straw       -0.06
Yaz         -0.06
alphabet    -0.06
άνα         -0.06
166         -0.06
Positive Logits
metrics     0.11
metrics     0.10
metric      0.09
Metric      0.09
_metric     0.09
Metrics     0.09
Metrics     0.09
Metric      0.09
mic         0.08
_metrics    0.07
Activations Density 0.006%