INDEX
    Explanations

    phrases or expressions that evoke similarity or comparison

    Negative Logits
    Token            Logit
                     -0.66
    "GOTREF"         -0.55
    " mijne"         -0.52
    "IsPostBack"     -0.51
    " sorella"       -0.50
    "+#+"            -0.50
    " Paglinawan"    -0.49
    " något"         -0.49
    " toalha"        -0.49
    " keduanya"      -0.48
    Positive Logits
    Token            Logit
    " للمعارف"       0.56
    " happening"     0.43
    " clim"          0.41
    "NameInMap"      0.40
    "OutOf"          0.40
    " Algebra"       0.40
    "bleau"          0.39
    " Gila"          0.38
    " what"          0.38
    "jší"            0.37
    Activation Density: 0.019%

    No Known Activations