INDEX
    Explanations

    terms and conditions or legal language in documents

    Negative Logits
    <bos>      -1.99
     started   -0.69
     got       -0.64
     went      -0.64
     came      -0.64
     seemed    -0.63
    ,          -0.63
     wanted    -0.62
     began     -0.62
     helped    -0.62
    Positive Logits
     lele       1.56
     vasi       1.54
     stockholm  1.52
     seksi      1.52
     wien       1.52
     cabrio     1.50
     saar       1.49
     maroc      1.48
     socie      1.47
     „,         1.46
    Activation Density: 0.570%

    No Known Activations