INDEX
    Explanations

    words related to obstacles or hindrances

    references to various types of barriers

    Negative Logits
    Token               Logit
    "phis"              -0.82
    "ivery"             -0.81
    "ergy"              -0.75
    "psc"               -0.69
    "yrics"             -0.69
    "sch"               -0.68
    "opia"              -0.68
    " largeDownload"    -0.67
    "imbabwe"           -0.67
    "iability"          -0.67
    Positive Logits
    Token               Logit
    " barriers"         1.31
    " barrier"          1.15
    " walls"            0.88
    " erected"          0.87
    " separating"       0.86
    " obstacles"        0.83
    " Walls"            0.82
    " Barrier"          0.81
    "riers"             0.77
    "buster"            0.76
    Activation Density: 0.014%

    No Known Activations