INDEX
    Explanations

    phrases that compare different entities in terms of their qualities

    Negative Logits
     Kün  -0.77
     Simult  -0.65
    Välislingid  -0.59
     Ueb  -0.58
     lele  -0.58
     meras  -0.57
     bambu  -0.56
     tortas  -0.55
     Jä  -0.55
    Avez  -0.55
    Positive Logits
     WALTZ  0.59
     not  0.56
     nemici  0.56
     ladri  0.55
    setzer  0.55
     giapp  0.54
     nemico  0.52
    not  0.52
     disrespect  0.51
     meant  0.50
    Act Density 0.146%

    No Known Activations