Explanations
Expressions of frustration or anger related to social issues.
This neuron activates on occurrences of the word “damn,” used as a mild profanity or intensifier.
Negative Logits
Token            Logit
primitive        -0.07
Reality          -0.07
Flo              -0.07
Symphony         -0.07
Flo              -0.07
Wald             -0.06
cake             -0.06
ResourceBundle   -0.06
poi              -0.06
-floor           -0.06
Positive Logits
Token            Logit
damn             0.10
damned           0.09
Damn             0.08
Damn             0.08
denně            0.07
.Gr              0.07
dam              0.07
darn             0.06
झ                0.06
иму              0.06
Activations Density: 0.003%