Explanations
The neuron activates on subjective-judgment phrases that frame something as “makes no sense to…” or “more important to…,” i.e. comparative or opinionated constructions.
Negative Logits
Workplace   -0.07
KG          -0.07
bedside     -0.07
climbs      -0.07
amburg      -0.07
owitz       -0.06
106         -0.06
Dmit        -0.06
оже         -0.06
rays        -0.06
Positive Logits
�           0.06
settlers    0.06
etk         0.06
ड़           0.06
presenting  0.06
SUR         0.06
Establish   0.06
flere       0.06
enumerate   0.06
activations 0.06
Activations Density: 0.036%