INDEX
    Explanations

    words that express comparisons and contrasts in various contexts

    NEGATIVE LOGITS
    "esh"              -0.14
    " literal"         -0.14
    " weekly"          -0.13
    " textual"         -0.13
    "idal"             -0.13
    " "                -0.13
    " permutations"    -0.13
    " modulo"          -0.13
    "ypical"           -0.13
    " gratis"          -0.13
    POSITIVE LOGITS
    " problem"         0.16
    "маз"              0.16
    " job"             0.16
    " thing"           0.15
    " story"           0.15
    "thing"            0.15
    " solution"        0.15
    " problema"        0.15
    " Olympics"        0.15
    "ÂĿ"               0.15
    Act Density 0.661%

    No Known Activations