Explanations
phrases related to connections between family and friends
references to family and friends in various contexts
Negative Logits
Token      Logit
_(         -0.72
WRITE      -0.67
fail       -0.66
TYPE       -0.63
Module     -0.62
lamm       -0.60
Layer      -0.60
ãĥĢ        -0.60
recomm     -0.59
iencies    -0.59
Positive Logits
Token            Logit
friends          1.64
neighbours       1.35
neighbors        1.35
siblings         1.31
acquaintances    1.30
grandparents     1.29
friends          1.28
grandchildren    1.25
coworkers        1.24
friendships      1.20
Activations Density: 0.116%
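
As a rough illustration of how a density figure like this one can be read, the sketch below assumes that activation density means the fraction of tokens on which the feature's activation is above zero. That definition, the function name `activation_density`, the `threshold` parameter, and the sample data are all assumptions made for illustration; none of them come from the page itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of tokens on which the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 tokens, with the feature firing on 12 of them.
sample = [0.0] * 10_000
for i in range(12):
    sample[i * 800] = 1.5  # made-up activation value

density = activation_density(sample)
print(f"Activations density: {density:.3%}")
```

Under this reading, a density of 0.116% would mean the feature fires on roughly one token in every 860.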