Explanations
technical definitions
This neuron primarily activates on capitalized tokens (proper nouns, acronyms, or other words beginning with an uppercase letter).
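The explanation above can be turned into a rough, checkable predicate. The sketch below is only an illustration: the function name `is_capitalized`, the choice to ignore leading whitespace, and the sample tokens are all assumptions, not part of the dashboard or of any auto-interp tooling.

```python
def is_capitalized(token: str) -> bool:
    """Return True if the first non-whitespace character is uppercase.

    Assumption: tokenizer whitespace prefixes (e.g. a leading space) are
    ignored, so " Paris" counts as capitalized.
    """
    stripped = token.lstrip()
    return stripped[:1].isupper()


# Hypothetical tokens, chosen only to exercise the predicate.
samples = [" Paris", "NASA", "table", ".Name", ""]
flags = {tok: is_capitalized(tok) for tok in samples}
```

With a predicate like this, one could estimate what share of a neuron's top-activating tokens fit the explanation, though a serious evaluation would need the real activation records.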
Negative Logits
Token          Logit
анд            -0.07
.original      -0.07
congreg        -0.06
ROLL           -0.06
rectangular    -0.06
ौड             -0.06
บค             -0.06
ीआई            -0.06
Zinc           -0.06
abc            -0.06
Positive Logits
Token          Logit
.toJSONString  0.08
<object        0.07
Threads        0.07
Mặt            0.06
]↵↵↵           0.06
fixing         0.06
Array          0.06
AGAIN          0.06
.Name          0.06
die            0.06
Activations Density: 0.080%
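As a rough guide to reading the density figure, the sketch below assumes that density means the fraction of sampled tokens on which the neuron's activation is above zero. That definition, the function name `activation_density`, and the sample data are assumptions made for illustration; the dashboard itself does not state how the number is computed.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of sampled tokens with a
    nonzero (above-threshold) activation.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 tokens, 8 of which activate the neuron.
sample = [0.0] * 9992 + [1.3] * 8
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.080%
```

Under this reading, a density of 0.080% would mean the neuron fires on roughly 8 out of every 10,000 tokens, which is sparse.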