INDEX
    Explanations

    each question, each turn

    NEGATIVE LOGITS
     brows              1.30
     وص                 1.28
    ্লাহ                1.25
    efeller             1.24
     Arti               1.23
     discret            1.22
     intitul            1.20
     پذیر               1.20
     variar             1.18
     thermodynamics     1.15
    POSITIVE LOGITS
     quiera             1.61
    о                   1.57
    hh                  1.46
    ies                 1.45
    е                   1.43
    ates                1.39
    enting              1.37
    age                 1.36
    ant                 1.34
    igned               1.34
    Activation Density: 0.004%

    No Known Activations