INDEX
Explanations
The neuron selectively activates on occurrences of “key” (and its plural form) in the text.
Negative Logits
_clause                -0.07
_Number                -0.06
stial                  -0.06
oufl                   -0.06
ふ                     -0.06
์)                     -0.06
HorizontalAlignment    -0.06
Ips                    -0.06
庄                     -0.06
首页                   -0.06
Positive Logits
Респ          0.07
cheesy        0.07
poorer        0.07
py            0.06
staircase     0.06
noteworthy    0.06
courtyard     0.06
platinum      0.06
tahun         0.06
verdi         0.06
Activations Density 0.005%
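
The density figure above can be read as the share of sampled tokens on which the neuron fires at all. The sketch below illustrates that reading under stated assumptions: it assumes "density" means the fraction of tokens with activation above zero, and the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical, not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: a token counts as "active" when its activation is
    strictly greater than the threshold (zero by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 1 active token among 20,000 sampled tokens.
sample = [0.0] * 19_999 + [3.2]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.005%
```

Under this reading, a density of 0.005% corresponds to roughly one active token per 20,000, which would fit a neuron that responds only to a single word and its plural.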