INDEX
    Explanations

    affirmative and confirming statements in dialogue

    Negative Logits
    ardon          -0.17
    tera           -0.15
     Principle     -0.15
     principle     -0.15
    bero           -0.14
     Fast          -0.14
    590            -0.14
     Merk          -0.14
     Tep           -0.14
    ibel           -0.14
    Positive Logits
    Ki             0.15
    ayi            0.14
    enin           0.14
     HOLDERS       0.14
     định          0.14
    kee            0.14
    ester          0.14
    ylim           0.13
    peq            0.13
    orm            0.13
    Act Density 0.154%

    No Known Activations