INDEX
    Explanations

    phrases related to similarity and comparisons

    NEGATIVE LOGITS
    "]);
    
    -0.67
    "];
    
    -0.64
     án
    -0.58
    "]));
    -0.56
    ++]=
    -0.56
    )');
    -0.56
    autés
    -0.56
     ');
    -0.55
    ">//
    -0.55
    bitat
    -0.54
    POSITIVE LOGITS
     same
    1.55
    Same
    1.50
    same
    1.49
     similar
    1.45
     Same
    1.39
    SAME
    1.29
     hetzelfde
    1.26
    Similar
    1.25
     identical
    1.22
    similar
    1.22
    Activation Density: 0.591%

    No Known Activations