Explanations
This neuron is looking for phrases related to communication or interaction.
Instances of the phrase "reaching out."
Negative Logits
Token        Logit
Carbuncle    -0.79
士           -0.75
Tsukuyomi    -0.67
asus         -0.62
reckoning    -0.61
rack         -0.61
Profit       -0.60
hemy         -0.59
MAT          -0.58
pancakes     -0.58
Positive Logits
Token        Logit
stretched    1.15
wards        0.85
ogene        0.71
invitations  0.70
via          0.69
rils         0.68
condolences  0.67
reprene      0.65
felt         0.65
worm         0.65
Activations Density: 0.030%
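The density figure can be read as the share of tokens on which the neuron fires at all. That reading is an assumption, since this page does not define the term. The sketch below computes such a percentage from a list of per-token activations. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the fraction of tokens with a strictly
    positive activation; the page itself does not state the definition.
    """
    if not activations:
        raise ValueError("activations must not be empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical data: 3 active tokens out of 10,000.
acts = [0.0] * 9997 + [1.2, 0.4, 2.5]
print(f"{activation_density(acts):.3f}%")  # prints 0.030%
```

Under this reading, a density of 0.030% means the neuron is active on roughly 3 of every 10,000 tokens, which fits a neuron that responds to one narrow phrase.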