Explanations
This neuron activates selectively on Portuguese terms for fires or burnings, e.g., “incêndio” (fire) or “queimadas urbanas” (urban burnings).
Negative Logits
озв          -0.07
sulfate      -0.07
ПО           -0.07
debit        -0.06
biology      -0.06
nicos        -0.06
soft         -0.06
IGNORE       -0.06
berk         -0.06
hos          -0.06
Positive Logits
蘭           0.07
.bias        0.07
Fire         0.07
wildfires    0.06
Farrell      0.06
)_           0.06
طلا          0.06
multitude    0.06
війни        0.06
TB           0.06
Activations Density 0.008%