INDEX
Explanations
distinctness/separation
The neuron detects mentions of regional linguistic or dialectal variations (e.g., differences in accent or local usage).
Negative Logits
Token       Logit
executive   -0.07
left        -0.07
killers     -0.07
imports     -0.07
sdk         -0.07
sense       -0.06
killing     -0.06
killed      -0.06
Killer      -0.06
ע           -0.06
Positive Logits
Token       Logit
Alpha       0.07
).\         0.06
.NonNull    0.06
stip        0.06
abis        0.06
itat        0.06
Clinic      0.06
PdfP        0.06
Magnus      0.06
Formatter   0.06
Activations Density: 0.034%