INDEX
    Explanations

    instances of conversational prompts or questions in dialogues

    Negative Logits
    ott               -0.16
     Woj              -0.15
    .arg              -0.15
    šti               -0.15
    arg               -0.15
    edin              -0.15
    .entry            -0.14
    arrants           -0.14
    ervas             -0.14
    isa               -0.14
    Positive Logits
    eza               0.15
    abler             0.14
    ocht              0.14
     unanimous        0.14
    abilia            0.14
    ubic              0.13
    aterangepicker    0.13
    ÎŃλ               0.13
    ÏĥÏĨ              0.13
    ÙĬراÙĨ            0.13
    Act Density 0.043%

    No Known Activations