INDEX
    Negative Logits
    " then"            -1.15
    " THEN"            -0.93
    "then"             -0.91
    "THEN"             -0.87
    " entonces"        -0.75
    "Then"             -0.73
    " então"           -0.71
    " DriverManager"   -0.67
    " następnie"       -0.66
    " затем"           -0.65
    Positive Logits
    " ‘"               0.54
                       0.52
    " King"            0.51
    " '"               0.50
    "<eos>"            0.49
    ","                0.49
    "↵↵"               0.49
    " Th"              0.48
    " Man"             0.47
    " Sir"             0.47
    Act Density 0.026%

    No Known Activations