INDEX
    Explanations

    references to doors and their mechanisms

    Negative Logits
    Token        Logit
    󠁿            -0.83
    liesslich    -0.81
    "")          -0.80
    anyahu       -0.79
     HasFactory  -0.78
    ']],         -0.78
     Palmas      -0.77
    --\n         -0.76
    ")),         -0.74
    )]$          -0.74
    Positive Logits
    Token   Logit
     doors  1.71
     door   1.64
     Doors  1.60
     Door   1.57
    door    1.55
     DOOR   1.50
    Door    1.45
    Doors   1.42
    doors   1.33
    DOOR    1.18
    Activation Density: 0.056%

    No Known Activations