INDEX
Explanations
references to places or concepts related to protection or safety
instances of the word "refuge" and related concepts indicating safety or shelter
Negative Logits
Token    Logit
ONES     -0.75
lass     -0.75
thin     -0.70
asts     -0.69
ym       -0.69
haw      -0.69
osc      -0.68
ahon     -0.66
ike      -0.66
oths     -0.66
Positive Logits
Token                Logit
refuge               0.99
Refuge               0.93
ctuary               0.77
ashtra               0.73
natureconservancy    0.70
Shooting             0.69
atoon                0.68
ilitation            0.66
Deity                0.66
UGE                  0.65
Activations Density: 0.016%