Explanations
The neuron activates on mentions of the model’s training data and related training contexts.
Negative Logits
patrol  -0.07
Terrain  -0.06
//****************************************************************  -0.06
era  -0.06
teen  -0.06
Justin  -0.06
OnDestroy  -0.06
-other  -0.06
autopsy  -0.06
Met  -0.06
Positive Logits
cess  0.07
CRM  0.07
acji  0.07
vacc  0.06
calm  0.06
(glm  0.06
catches  0.06
,void  0.06
*)  0.06
daq  0.06
Activations Density: 0.012%