Explanations
model performance evaluation
This neuron activates on words referring to class imbalance or unbalanced datasets.
Negative Logits
008          -0.07
reaction     -0.06
memories     -0.06
branches     -0.06
Gaussian     -0.06
ArrayType    -0.06
inally       -0.06
MART         -0.06
reactions    -0.06
,args        -0.06
Positive Logits
Something    0.07
(Contact     0.07
volumes      0.06
verb         0.06
曾           0.06
rejects      0.06
VB           0.06
anneer       0.06
subroutine   0.06
niž          0.06
Activations Density: 0.006%