Explanations
This neuron fires on literal string content inside quotation marks (e.g., the “Hello, World!” tokens).
Negative Logits
received    -0.07
doses       -0.07
fix         -0.07
_keyword    -0.06
medals      -0.06
night       -0.06
dress       -0.06
-third      -0.06
bad         -0.06
brib        -0.06
Positive Logits
DllImport     0.07
dalla         0.07
:");↵         0.07
skupiny       0.06
[DllImport    0.06
'>";↵         0.06
олод          0.06
olvency       0.06
ourcem        0.06
.isNotBlank   0.06
Activation Density: 0.012%