INDEX
    Explanations

    references to walls or barriers in various contexts

    Negative Logits
    Token           Logit
    " temprana"     -0.84
    "})));"         -0.82
    " Plural"       -0.77
    " inoxydable"   -0.76
    " Judson"       -0.76
    "selbe"         -0.76
    " wezen"        -0.75
    " pleaſure"     -0.75
    "epam"          -0.74
    "íso"           -0.73
    Positive Logits
    Token           Logit
    " wall"         2.07
    " WALL"         2.02
    " Wall"         1.94
    " walls"        1.91
    "Wall"          1.79
    "wall"          1.78
    " Walls"        1.75
    "WALL"          1.68
    "Walls"         1.61
    "walls"         1.60
    Activation Density: 0.052%

    No Known Activations