Explanations
references to ethical guidelines and institutional protocols
years in the 2000s, particularly around 2010-2020. The neuron shows high activation for numbers starting with '20' followed by another digit, which corresponds to years in the early 21st century.
Negative Logits
Token             Logit
InjectAttribute   -0.75
RenderAtEndOf     -0.68
متعلقه            -0.66
хьтан             -0.63
۸                 -0.61
بوابة             -0.60
۴                 -0.60
Personendaten     -0.60
۶                 -0.60
ंदीखरीदारी        -0.59
Positive Logits
Token   Logit
1       2.31
1       1.19
১       1.09
১       1.01
१       0.95
١       0.89
١       0.86
१       0.85
۱       0.81
𝟭       0.76
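
The positive-logit tokens look like the digit "one" written in several scripts. As a rough check rather than part of the original dashboard, the Python sketch below looks up the Unicode digit value of each distinct token with the standard `unicodedata` module. The code points are an assumption about which characters appear in the list above (Bengali, Devanagari, Arabic-Indic, Extended Arabic-Indic, and a mathematical sans-serif bold digit), and the names `positive_tokens` and `digit_values` are illustrative only.

```python
import unicodedata

# Distinct single-character tokens from the positive-logits list.
# Assumption: these escapes match the characters shown in the dashboard.
positive_tokens = ["1", "\u09e7", "\u0967", "\u0661", "\u06f1", "\U0001d7ed"]


def digit_values(chars):
    """Map each character to its Unicode digit value."""
    return {c: unicodedata.digit(c) for c in chars}


values = digit_values(positive_tokens)
for ch, value in values.items():
    # Print the code point, its official Unicode name, and its digit value.
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}  digit={value}")
```

If the assumed code points are right, every entry resolves to the same digit value, which fits the reading that the feature promotes "1" regardless of script.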
Activations Density: 7.719%