INDEX
Explanations
This neuron mainly detects words related to advertisements or promotions.
It responds to the abbreviation "Ad" in various contexts.
Negative Logits
Kubrick     -0.69
Jr          -0.68
trout       -0.68
Stronghold  -0.67
Weir        -0.66
crab        -0.63
biomass     -0.61
wool        -0.61
Tsukuyomi   -0.60
Bulg        -0.60
Positive Logits
vertis      1.41
mittedly    1.33
olescent    1.31
elaide      1.28
vance       1.24
vantage     1.23
olesc       1.21
vertising   1.21
ventures    1.20
ept         1.20
Activations Density: 0.027%
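
The positive-logit tokens listed above read naturally as continuations of "Ad" (for example, "vertis" would give "Advertis"), which fits the second explanation. The following is a minimal Python sketch of that reading. It assumes the prefix "Ad" from that explanation; the pairing is an inference from the listed tokens, not something the page states, and the names `positive_logits`, `PREFIX`, and `complete` are hypothetical.

```python
# Top positive-logit tokens and their values, copied from the list above.
positive_logits = [
    ("vertis", 1.41),
    ("mittedly", 1.33),
    ("olescent", 1.31),
    ("elaide", 1.28),
    ("vance", 1.24),
    ("vantage", 1.23),
    ("olesc", 1.21),
    ("vertising", 1.21),
    ("ventures", 1.20),
    ("ept", 1.20),
]

# Assumption: the prefix comes from the second explanation ("Ad"),
# not from the logit data itself.
PREFIX = "Ad"


def complete(prefix, tokens):
    """Join the prefix to each token, keeping the logit value alongside."""
    return [(prefix + token, logit) for token, logit in tokens]


completions = complete(PREFIX, positive_logits)

# Print each candidate completion next to its logit value.
for word, logit in completions:
    print(f"{word:<14}{logit:.2f}")
```

Under this assumption, each boosted token forms a word or word fragment beginning with "Ad". That suggests the neuron may respond to the character sequence "Ad" as much as to advertising as a topic.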