INDEX
    Explanations

    phrases that indicate relationships or conditions between entities, often expressed through relative clauses

    Negative Logits
                  -0.59
    "らう"         -0.57
                  -0.53
    " ferie"      -0.53
    " vieja"      -0.53
    "fallen"      -0.52
    " ratify"     -0.52
    " waking"     -0.52
    " Team"       -0.52
    "drücken"     -0.52
    Positive Logits
    " who"        1.08
    " ambao"      0.94
    " which"      0.93
    " []:"        0.92
    " οποίο"      0.91
    "which"       0.86
    " które"      0.85
    "who"         0.84
    " quien"      0.83
    " والذي"      0.82
    Act Density 0.373%

    No Known Activations