Explanations
Informational content related to historical or cultural documents.
This neuron activates strongly on anonymized placeholder tokens such as “NAME_1” and “NAME_2”, i.e. redacted named-entity markers.
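To make the second explanation concrete, here is a minimal sketch of how such placeholder tokens could be located in raw text. It assumes the placeholders always have the form NAME_<number>, as in the two examples above; the pattern and the helper name `find_placeholders` are illustrative and not part of the dashboard.

```python
import re

# Assumption: placeholders look like NAME_<digits>, as in "NAME_1" and "NAME_2".
PLACEHOLDER = re.compile(r"\bNAME_\d+\b")


def find_placeholders(text: str) -> list[str]:
    """Return every NAME_<n> placeholder in order of appearance."""
    return PLACEHOLDER.findall(text)


if __name__ == "__main__":
    sample = "NAME_1 asked NAME_2 to forward the letter to NAME_1."
    print(find_placeholders(sample))
```

Text positions matched this way are where the explanation predicts the neuron would fire; whether it actually does would have to be checked against recorded activations.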
Negative Logits
axes       -0.07
avi        -0.06
Waves      -0.06
Flowers    -0.06
<Player    -0.06
escal      -0.06
]].        -0.06
Ok         -0.06
Jobs       -0.06
flowers    -0.06
Positive Logits
TagName    0.07
consoles   0.06
.Root      0.06
.uint      0.06
četně      0.06
बच         0.06
dapat      0.06
.Metro     0.06
'util      0.06
unmanned   0.06
Activations Density 0.012%