Explanations
This neuron detects mentions of the social media platform “TikTok”, including cases where the name is split into the tokens “tik” + “tok”.
Negative Logits
.High        -0.07
shirt        -0.07
ське         -0.07
Spurs        -0.07
غة           -0.07
дом          -0.07
票           -0.07
orf          -0.06
ského        -0.06
ANCELED      -0.06
Positive Logits
Tik          0.08
foreseeable  0.06
defined      0.06
confidential 0.06
(IC          0.06
sữa          0.06
одав         0.06
etik         0.06
_ability     0.06
attempting   0.06
Activations Density 0.005%