Explanations
words related to protection or authority figures
references to guardianship or protector roles in various contexts
Negative Logits
urses      -0.76
mers       -0.72
mia        -0.71
bers       -0.69
hist       -0.68
urred      -0.67
cloth      -0.66
RESULTS    -0.66
reme       -0.65
skirts     -0.65
Positive Logits
hip        1.01
arium      0.93
hips       0.93
guardians  0.93
Guardians  0.88
ians       0.86
arians     0.85
iola       0.81
angel      0.80
Guard      0.79
Activation density: 0.026%