INDEX
    Explanations

    prompts or instructions

    Negative Logits
    "Actually"      0.60
    "Didn"          0.59
    " Actually"     0.59
    " Otherwise"    0.58
    " Didn"         0.58
    "Otherwise"     0.56
    " Gives"        0.54
    "Honestly"      0.53
    " Зараз"        0.53
    " Somehow"      0.51
    Positive Logits
    " consider"     1.63
    " let"          1.61
    " hãy"          1.49
    " please"       1.46
    " remember"     1.36
    " try"          1.33
    " keep"         1.29
    " ensure"       1.23
    " assume"       1.19
    " imagine"      1.19
    Act Density 0.643%

    No Known Activations