Explanations
anonymous
This neuron detects occurrences of the concept of anonymity, especially the word “anonymous” and closely related contexts.
Negative Logits
Token       Logit
Well        -0.08
well        -0.08
Work        -0.07
Relation    -0.07
-hard       -0.06
(@(         -0.06
table       -0.06
Building    -0.06
атку        -0.06
경기        -0.06
Positive Logits
Token            Logit
Anonymous        0.09
anonymous        0.09
anonymous        0.09
anonymously      0.08
anonymity        0.08
AllowAnonymous   0.08
anon             0.08
mon              0.07
Anonymous        0.07
-os              0.07
Activations Density 0.004%
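For readers unfamiliar with the density figure, here is a minimal sketch of how such a number could be computed. It assumes density means the fraction of token positions on which the neuron's activation is above zero; the page itself does not state the definition. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" is the share of tokens on which the neuron fires.
    This is an illustrative definition, not one confirmed by the source page.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the neuron fires on 4 of 100,000 token positions.
sample = [0.0] * 99_996 + [1.2, 0.7, 3.1, 0.4]
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.004%"
```

Under this reading, a density of 0.004% would mean the neuron is active on roughly 4 tokens in every 100,000, which fits a feature tied to one narrow word family.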