INDEX
    Explanations

    dialogue sentences marked with punctuation at the end

    Negative Logits
    ktop        -0.87
    acan        -0.79
    undai       -0.78
    etz         -0.76
     spons      -0.74
    lege        -0.74
    isconsin    -0.74
    adelphia    -0.72
    isite       -0.72
     confir     -0.71
    Positive Logits
    ITNESS      1.06
    ────        0.94
    Reply       0.92
    ¯¯¯¯¯¯¯¯    0.86
    Suddenly    0.85
    --------------------------------------------------------    0.84
    Reward      0.83
     CONTIN     0.80
     Tears      0.79
    Plug        0.78
    Activation Density: 7.298%

    No Known Activations