Explanations
The neuron fires on strong profanity, especially multi‐word or intensified swears (e.g. “God damn,” “fuck,” “cunt”), marking where highly offensive curse phrases occur.
Discussions of profanity and offensive speech, including meta-talk about speaking style and advice or warnings about using such language.
Negative Logits
Token             Value
운영              0.89
🔬                0.87
📈                0.84
🧪                0.83
bioinformatics    0.82
DeFi              0.78
ipynb             0.74
பணிகள்            0.74
OpenGL            0.73
📊                0.73
Positive Logits
Token             Value
utterances        2.67
utterance         2.61
phrases           2.54
frases            2.33
verbal            2.30
phrasing          2.29
speech            2.27
uttered           2.26
phrase            2.18
words             2.15
Activations Density 2.924%
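
A density figure like the one above is usually read as the share of token positions on which the neuron is active. The sketch below shows one way to compute it, assuming "active" means an activation strictly above a threshold of zero. The function name `activation_density`, the threshold, and the sample values are illustrative assumptions, not taken from the dashboard.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: a position counts as active when its activation is strictly
    greater than `threshold`; the dashboard's exact definition is not stated.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Made-up activations over eight token positions, for illustration only.
sample = [0.0, 0.0, 1.7, 0.0, 0.0, 0.0, 0.3, 0.0]
print(f"{activation_density(sample):.3%}")
```

On a real corpus the same ratio would be taken over all sampled tokens, so a value near 3% would mean the neuron is silent on the large majority of positions.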