Explanations
This neuron detects words that express evaluation of quality or performance, such as “adequacy” and “effectiveness.”
Negative Logits
bait          -0.07
(chalk        -0.07
096           -0.06
494           -0.06
Low           -0.06
Accessible    -0.06
corrupted     -0.06
fossil        -0.06
updateTime    -0.06
mousedown     -0.06
Positive Logits
Việc          0.07
ีช            0.07
último        0.07
ुलन           0.07
.'));↵        0.06
orang         0.06
akan          0.06
keras         0.06
удал          0.06
Nẵng          0.06
Activations Density: 0.085%