Explanations
The neuron activates on occurrences of the word “shield” (including its morphological variants like “shielded” or “shielding”).
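As a rough illustration of what this explanation predicts, the sketch below approximates it with a case-insensitive pattern over the stem "shield" and a few common suffixes. The pattern, the suffix list, and the helper name `find_shield_variants` are assumptions made for illustration; this is a proxy for the explanation text, not the neuron's actual behavior.

```python
import re

# Hypothetical approximation of the explanation, not the neuron itself:
# match "shield" plus a few common suffixes, ignoring case.
SHIELD_PATTERN = re.compile(r"\bshield(?:s|ed|ing)?\b", re.IGNORECASE)


def find_shield_variants(text: str) -> list[str]:
    """Return each whole word in `text` that matches the assumed pattern."""
    return SHIELD_PATTERN.findall(text)
```

Comparing matches like these against the tokens where the neuron actually fires is one way to check how well the explanation covers its activations.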
Negative Logits
time      -0.07
(date     -0.07
�         -0.07
Time      -0.07
Become    -0.07
�         -0.07
aturday   -0.07
Numer     -0.07
Vari      -0.07
ago       -0.06
Positive Logits
shield      0.17
Shield      0.15
Shield      0.13
shields     0.12
Shields     0.10
shielding   0.09
shield      0.08
Nichols     0.08
chip        0.07
////////////////////////////////////////////////////   0.07
Activations Density 0.003%