Explanations
prestigious
The neuron fires selectively on the adjective “prestigious,” which marks high-status or honorific contexts.
Negative Logits
Token        Logit
sand         -0.07
mir          -0.06
Love         -0.06
-book        -0.06
-or          -0.06
comic        -0.06
Polynomial   -0.06
SIM          -0.06
WND          -0.06
eyond        -0.06
Positive Logits
Token          Logit
prestigious    0.14
prestige       0.11
distinguished  0.09
Prest          0.09
тех            0.08
NEC            0.07
STONE          0.07
::_('          0.07
istinguished   0.07
rest           0.07
Activations Density 0.004%