Explanations
References to specific individuals or titles.
The neuron detects indefinite noun‐phrase introductions used to characterize a person or group (i.e. “a [entity] …” with descriptive modifiers).
Negative Logits
softly          -0.07
ara             -0.07
-flight         -0.06
ez              -0.06
successfully    -0.06
(Level          -0.06
lace            -0.06
rances          -0.06
riders          -0.06
embr            -0.06
Positive Logits
Nimbus          0.07
$__             0.06
Assertion       0.06
注              0.06
产业            0.06
pohled          0.06
річ             0.06
↵↵              0.06
']*             0.06
Stat            0.06
Activations Density: 0.074%