    Explanations

    comparisons using the phrase "just as."

    Negative Logits
    Token       Logit
    "anca"      -0.20
    "ạm"        -0.18
    "asma"      -0.17
    "loub"      -0.17
    "kaar"      -0.14
    "rias"      -0.14
    " thus"     -0.14
    "uhl"       -0.14
    ".fhir"     -0.14
    "iou"       -0.14
    Positive Logits
    Token       Logit
    "arily"     0.15
    "/cop"      0.15
    "zeitig"    0.14
    "CI"        0.14
    " Ord"      0.13
    "ty"        0.13
    "aru"       0.13
    "ufe"       0.13
    "eru"       0.13
    "anger"     0.13
    Activation Density: 0.020%
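
As a rough illustration of what a density figure of this size means, the sketch below assumes that density is the fraction of token positions on which the feature's activation exceeds zero. That definition, the function name `activation_density`, and the sample data are all assumptions made for illustration; they are not taken from this page or from the tool that generated it.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumed definition: a position counts as active when its value is strictly
    greater than the threshold. An empty input yields 0.0.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical data: 10,000 token positions, exactly 2 of which activate.
acts = [0.0] * 10_000
acts[17] = 0.8
acts[4242] = 1.3

density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.020%
```

Under this assumed definition, a density of 0.020% corresponds to the feature firing on about 2 of every 10,000 tokens.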

    No Known Activations