Explanations
Security, safety, and risk
This neuron detects mentions of personal or sensitive user information being shared (e.g. locations, dates of birth, children’s names, travel plans, photos).
Negative Logits
cre: -0.07
包: -0.07
Pull: -0.06
ians: -0.06
vibe: -0.06
Office: -0.06
caregiver: -0.06
clr: -0.06
Crime: -0.06
겠: -0.06
Positive Logits
newly: 0.07
ENDOR: 0.06
orda: 0.06
ология: 0.06
دف: 0.06
ontvangst: 0.06
(of: 0.06
instructions: 0.06
NI: 0.06
absolute: 0.06
Activations Density 0.178%