    Explanations

    references to "surface."

    Negative Logits
    " Grot"             -0.52
    " kysy"             -0.50
    " laaj"             -0.50
    " quotations"       -0.48
    "hero"              -0.48
    " prospective"      -0.47
    " demonstration"    -0.47
    " Sanderson"        -0.46
    "oura"              -0.46
    " proposals"        -0.46

    Positive Logits
    " thanks"            0.70
    " surface"           0.66
    "surface"            0.63
    " Surface"           0.62
    " SURFACE"           0.61
    " lenker"            0.57
    "SURFACE"            0.57
    "Surface"            0.56
    " gracias"           0.55
    " graças"            0.54
    Activation Density: 0.280%

    No Known Activations