INDEX
Explanations
references to safe places or havens
terms related to safe spaces or protective environments
Negative Logits
Token     Logit
lass      -0.90
ahon      -0.74
othy      -0.66
iam       -0.66
omen      -0.65
isable    -0.65
ivan      -0.65
iple      -0.65
thin      -0.65
inness    -0.63
Positive Logits
Token       Logit
ctuary      0.79
refuge      0.76
encl        0.75
bub         0.72
sanctuary   0.71
Refuge      0.71
mingham     0.71
pmwiki      0.69
havens      0.68
habitat     0.67
Activations Density: 0.085%