Explanations
The neuron activates on the substring “test”, whether it appears as a standalone word or embedded in a larger word.
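As a rough illustration of the explanation above, the sketch below checks which of the logit tokens listed on this page contain the substring "test". It is a hypothetical sanity check, not the tool's own method: the function name `contains_test` is made up for this example, and treating the match as case-insensitive is an assumption based only on the listed tokens. The logit lists describe output tokens the neuron promotes or suppresses, so this is a proxy for the explanation rather than a measurement of activations.

```python
# Hypothetical sketch: does each listed token contain the substring "test"?
# Case-insensitive matching is an assumption, not something the page states.

def contains_test(token: str) -> bool:
    """Return True if the token contains 'test', ignoring case."""
    return "test" in token.lower()

# Tokens copied from the positive and negative logit lists on this page.
positive_tokens = ["Test", "test", "Test", "etest", ".test",
                   "test", "TEST", "tests", "test", "Testing"]
negative_tokens = ["Around", "smlou", "king", "around", "smlouvy",
                   "схем", ".crop", "hors", "brigade", "chocol"]

positive_hits = [t for t in positive_tokens if contains_test(t)]
negative_hits = [t for t in negative_tokens if contains_test(t)]

print(f"positive tokens containing 'test': {len(positive_hits)}/{len(positive_tokens)}")
print(f"negative tokens containing 'test': {len(negative_hits)}/{len(negative_tokens)}")
```

Under these assumptions, every positive-logit token matches and no negative-logit token does, which is consistent with the explanation.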
Negative Logits
Around     -0.07
smlou      -0.06
king       -0.06
around     -0.06
smlouvy    -0.06
схем       -0.06
.crop      -0.06
hors       -0.06
brigade    -0.06
chocol     -0.06
Positive Logits
Test       0.15
test       0.15
Test       0.14
etest      0.11
.test      0.11
test       0.11
TEST       0.11
tests      0.11
test       0.10
Testing    0.10
Activations Density 0.068%