INDEX
Explanations
proper nouns, capitalized words
The neuron reliably detects mentions of “federated learning.”
Negative Logits
.pow      -0.07
.Π        -0.06
ialog     -0.06
while     -0.06
panies    -0.06
Нат       -0.06
ccount    -0.06
.Split    -0.06
.addr     -0.06
çu        -0.06
Positive Logits
ang        0.07
=R         0.07
Ф          0.07
189        0.07
arreglo    0.06
)'),       0.06
hamburger  0.06
uffed      0.06
>>(↵       0.06
“,         0.06
Activations Density 0.051%