INDEX
Explanations
book/movie summaries
This neuron activates most strongly on opening character-introduction clauses, i.e. phrases that name a person and state who they are or what they do (name + copula + role/attribute).
Negative Logits
cedures         -0.07
seeds           -0.07
Price           -0.07
cost            -0.07
Mark            -0.06
advisory        -0.06
Jonathan        -0.06
incarceration   -0.06
seed            -0.06
ических         -0.06
Positive Logits
использовать    0.07
jas             0.06
лять            0.06
//              0.06
Engine          0.06
assword         0.06
_CFG            0.06
wireless        0.06
ﺍ               0.06
่อง             0.06
Activations Density: 0.053%