    Explanations

    occurrences of the word "similar."

    Negative Logits
    Token               Logit
    "es"                -0.65
    "dymyr"             -0.63
    "ate"               -0.60
    "o"                 -0.59
    " sto"              -0.57
    " *"                -0.56
    " voz"              -0.54
    "bnf"               -0.54
    "arbox"             -0.53
                        -0.53

    Positive Logits
    Token               Logit
    "RectangleBorder"   1.26
    " nahilalakip"      1.22
    " SIMILAR"          1.14
    " Similar"          1.08
    "Похо"              1.06
    "iliar"             1.06
    "Similar"           1.06
    " similar"          1.03
    "similar"           1.01
    "évaluateur"        1.00

    Activation Density: 0.100%

    No Known Activations