INDEX
Explanations
names or titles possibly related to family or community ties
references to parental figures or familial relationships
Negative Logits
ser              -0.80
Edited           -0.77
binding          -0.72
PORT             -0.69
nc               -0.68
justice          -0.68
sports           -0.67
advertisement    -0.67
SELECT           -0.67
COL              -0.66
Positive Logits
apa              1.27
Papa             1.19
omo              0.95
emonic           0.93
Mama             0.90
itone            0.83
artifacts        0.82
uppet            0.80
insula           0.79
awan             0.76
Activations Density 0.014%