INDEX
    Explanations
    Negative Logits
     inside
    -0.11
    /backend
    -0.11
    内
    -0.11
     downstairs
    -0.10
     underlying
    -0.10
     interiors
    -0.10
     내
    -0.10
     behind
    -0.10
    inside
    -0.09
    下
    -0.09
    Positive Logits
     surface
    0.55
     Surface
    0.50
    surface
    0.45
    Surface
    0.43
     above
    0.36
    urface
    0.36
    _surface
    0.35
    .surface
    0.33
    (surface
    0.32
     Above
    0.32
    Act Density 0.095%

    No Known Activations