INDEX
    Explanations

    starting a response with OK

    Negative Logits
    Token         Logit
    "Cur"         0.40
    "WHEN"        0.38
    "WHERE"       0.36
    "Dun"         0.36
    "Tras"        0.36
    "RAM"         0.36
    "QUE"         0.35
    "വെ"          0.35
    "র্ড"          0.34
    "helpTool"    0.34

    Positive Logits
    Token         Logit
    " fine"       0.82
    " okay"       0.75
    " OK"         0.72
    " FINE"       0.70
    "Fine"        0.68
    " Okay"       0.67
    "fine"        0.67
    " Fine"       0.66
    "ओके"         0.66
    " ok"         0.61
    Activation Density 0.057%

    No Known Activations