INDEX
    Explanations

    phrases indicating purpose or intention

    Negative Logits
    ing          -0.76
    ostante      -0.71
     polymorph   -0.70
     avoient     -0.70
     Gorg        -0.69
    Tikang       -0.68
    guenos       -0.68
     Brutus      -0.68
     culoare     -0.66
    voorbeeld    -0.65
    Positive Logits
     nel         1.05
     untuk       0.99
    Untuk        0.78
     nell        0.78
    ในการ        0.77
    Nel          0.75
     để          0.74
     برای        0.73
     לה          0.72
     Để          0.70
    Activation Density: 0.028%

    No Known Activations