INDEX
Explanations
The neuron fires on words that signal physical harm, danger, or injury.
New Auto-Interp
Negative Logits
desarrollo
-0.06
نگ
-0.06
الشم
-0.06
يد
-0.06
:path
-0.06
ustainable
-0.06
iman
-0.06
convertible
-0.06
Роз
-0.05
uyễn
-0.05
POSITIVE LOGITS
BTTagCompound
0.07
graceful
0.07
.functions
0.07
erus
0.06
phenotype
0.06
.*(
0.06
(batch
0.06
-big
0.06
.':
0.06
_LINK
0.06
Activations Density 0.022%