INDEX
    Explanations

    references to walls in various contexts

    Negative Logits
    " Judson"        -0.81
    " temprana"      -0.80
    "epam"           -0.75
    "})));"          -0.75
    "észetes"        -0.74
    " wezen"         -0.73
    "Prat"           -0.73
    " themſelves"    -0.73
    " jonge"         -0.70
    " pleaſure"      -0.69
    Positive Logits
    " wall"          2.38
    " WALL"          2.29
    " Wall"          2.27
    "Wall"           2.10
    " walls"         2.10
    "wall"           2.06
    "WALL"           1.94
    " Walls"         1.92
    "walls"          1.78
    "Walls"          1.75
    Act Density 0.036%

    No Known Activations