Explanations
The neuron selectively activates on subwords ending in “scopic” (as in “arthroscopic,” “endoscopic,” etc.).
Negative Logits
Hav            -0.07
Put            -0.07
fire           -0.07
pessoa         -0.06
Str            -0.06
prejudice      -0.06
地             -0.06
/weather       -0.06
quantities     -0.06
task           -0.06
Positive Logits
अगर            0.07
defaultValue   0.07
Cisco          0.07
ason           0.07
.Design        0.06
.backward      0.06
oracle         0.06
чис            0.06
carrots        0.06
©©             0.06
Activations Density: 0.001%