INDEX
Explanations
The neuron activates on technical NLP jargon and methodology terms (e.g. “coreferential,” “grammatical,” “categorizing,” “meaning”), flagging mentions of natural language processing concepts.
Negative Logits
Allocate
-0.06
-based
-0.06
.currentThread
-0.06
ical
-0.06
passport
-0.06
Initialized
-0.06
jurisdiction
-0.06
quicker
-0.06
洛
-0.05
cracked
-0.05
Positive Logits
(".
0.07
μένα
0.06
ossier
0.06
희
0.06
(*.
0.06
otel
0.06
unins
0.06
..↵
0.06
/',↵
0.06
$('.
0.06
Activations Density 0.115%