    Explanations

    the word "time", and sometimes "heat"

    Negative Logits
    Token       Logit
    <eos>       -0.94
    .           -0.91
                -0.89
     the        -0.84
    ↵↵          -0.82
     "          -0.80
     “          -0.77
    ,           -0.77
     a          -0.75
     to         -0.74
    Positive Logits
    Token       Logit
     Efq        1.70
     Monfieur   1.63
     Theſe      1.61
     Reſ        1.59
     myſelf     1.58
     iſt        1.49
     Anſ        1.46
     pleaſure   1.43
     Houſe      1.38
     Jefus      1.38
    Activation Density: 0.282%

    No Known Activations