Explanations
Legal disclaimers
The neuron detects legal fault-and-liability terms, especially words such as “misconduct,” “negligence,” “intentional,” or “fraudulent” that signal wrongdoing.
Negative Logits
linger            -0.06
submar            -0.06
.TextAlignment    -0.06
tile              -0.06
lifecycle         -0.06
Increasing        -0.06
signal            -0.06
tr                -0.06
mock              -0.06
Tiles             -0.06
Positive Logits
resas             0.08
Güney             0.07
dot               0.07
ignty             0.06
Dot               0.06
rox               0.06
exc               0.06
confidential      0.06
meslek            0.06
emploi            0.06
Activations Density: 0.004%