Explanations
The neuron is looking for occurrences of the word "while"
the word "while" in various contexts
Negative Logits
emb          -0.70
backs        -0.69
arranged     -0.67
iron         -0.66
fields       -0.66
organized    -0.66
organised    -0.65
legend       -0.65
Ireland      -0.65
coming       -0.65
Positive Logits
while            3.67
theless          1.17
during           1.13
instead          1.09
until            1.05
continue         1.04
soDeliveryDate   1.03
despite          1.01
lest             1.00
before           1.00
Activations Density 0.022%
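
A density figure like the one above can be read as the share of token positions on which the neuron fires. The sketch below is a minimal illustration under that assumption; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from the dashboard or from any particular library.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means (number of active positions) / (total positions).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, of which 2 are active.
acts = [0.0] * 10_000
acts[17] = 3.2
acts[4242] = 0.9

density = activation_density(acts)
print(f"{density:.3%}")  # prints "0.020%"
```

On this reading, a density of 0.022% corresponds to roughly 22 active positions per 100,000 tokens, which fits a neuron that responds mainly to one word.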