INDEX
    Explanations

    informal phrases and conversational opinions about various topics

    Negative Logits
    Token   Logit
    .       -0.56
     W      -0.55
            -0.54
     w      -0.45
            -0.45
    ?.      -0.42
    );      -0.42
    \       -0.41
    ");     -0.41
    \]      -0.40
    Positive Logits
    Token       Logit
     myſelf     1.02
     Мексичка   0.99
     ―――――      0.98
     houſe      0.91
     ſmall      0.91
     ſche       0.89
     leſs       0.89
     raiſ       0.89
     pleaſure   0.88
     Савезне    0.86
    Activation Density: 0.220%

    No Known Activations