Explanations
hypo/hyper conditions
The neuron activates on words beginning with “hypo-” (e.g. hypoxia, hypotension, hypoglycemia).
Negative Logits
Token      Logit
ten        -0.08
olders     -0.08
renters    -0.08
resin      -0.08
lance      -0.08
replic     -0.07
dresser    -0.07
OSE        -0.07
caliber    -0.07
eight      -0.07
Positive Logits
Token          Logit
hyp            0.12
hyp            0.11
Hyp            0.10
hypocrisy      0.08
(              0.07
гід            0.07
.Top           0.07
v              0.06
loophole       0.06
hypothetical   0.06
Activations Density 0.011%
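
To make the density figure concrete, here is a minimal sketch. It assumes "activations density" means the fraction of token positions on which the neuron's activation is above zero; the page itself does not define the term. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" is the share of tokens on which the neuron fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the neuron fires on 11 of 100,000 tokens.
sample = [1.0] * 11 + [0.0] * 99_989
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.011%
```

Under that reading, a density of 0.011% corresponds to roughly one active token in every 9,000.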