Explanations
The neuron fires on occurrences of the word “expense” (and closely related expense‐tracking terms).
Negative Logits
weaving     -0.07
donor       -0.07
ideal       -0.07
cycling     -0.07
cycle       -0.07
sits        -0.07
olvable     -0.06
molecule    -0.06
Magnet      -0.06
bottle      -0.06
Positive Logits
expenses    0.14
Expenses    0.14
expense     0.12
expenses    0.11
expense     0.10
Expense     0.10
Expense     0.10
расход      0.08
еж          0.07
แผ          0.07
Activations Density: 0.005%
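
To make the density figure concrete, here is a minimal sketch. It assumes that "activation density" means the fraction of tokens on which the neuron has a nonzero activation; the page itself does not state the definition. The function name, the threshold parameter, and the sample data are hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: density = (number of active tokens) / (total tokens).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: one active token out of 20,000.
sample = [0.0] * 19_999 + [3.2]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under that assumption, a density of 0.005% corresponds to roughly one active token in every 20,000, which fits a neuron that responds to a single narrow word family.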