Explanations
model performance
This neuron activates on words and phrases that talk about improving or measuring performance (e.g., “improve,” “performance,” “improving”).
Negative Logits
Token        Logit
áno          -0.06
Thumbnail    -0.06
zkou         -0.06
anlık        -0.06
salad        -0.06
estudio      -0.05
simulator    -0.05
giảng        -0.05
unar         -0.05
CDF          -0.05
Positive Logits
Token           Logit
''),            0.09
ceil            0.07
={},            0.07
声              0.07
microseconds    0.07
alg             0.07
competing       0.07
Proceed         0.07
(hero           0.06
Raises          0.06
Activations Density 0.033%