INDEX
Explanations
The neuron fires on the phrase “works for us” and on close variants expressing the same idea, such as “works for both of us”.
Negative Logits
Token      Logit
stdout     -0.07
smash      -0.06
elt        -0.06
енную      -0.06
aces       -0.06
设         -0.06
MacBook    -0.06
Employ     -0.06
拓         -0.06
dances     -0.06
Positive Logits
Token      Logit
urnal      0.07
empirical  0.06
issional   0.06
fairness   0.06
(TABLE     0.06
JV         0.06
whatever   0.06
Rio        0.06
rooted     0.06
served     0.06
Activations Density: 0.008%