INDEX
    Explanations

    instances of the word "compare" and its variations

    Negative Logits
    Token          Logit
    "çĦ¶"          -0.19
    "ereotype"     -0.17
    "anna"         -0.16
    "amping"       -0.15
    ".nz"          -0.15
    "еÑİ"          -0.14
    "ough"         -0.14
    "elt"          -0.14
    "ivia"         -0.14
    "imore"        -0.14
    Positive Logits
    Token          Logit
    " apples"      0.27
    " favor"       0.22
    " unfavor"     0.22
    "favor"        0.22
    " favour"      0.20
    " between"     0.19
    "rios"         0.18
    "isons"        0.18
    " notes"       0.18
    "ãģ¹"          0.17
    Activation Density: 0.023%

    No Known Activations