    Explanations

    explaining how things work

    Negative Logits
     affaires        0.44
    𝔯                0.43
     ""<             0.43
     megawatts       0.41
     infringer       0.41
    λαν              0.41
     servidor        0.40
                     0.38
     enfermed        0.38
     transformer     0.38

    Positive Logits
     M               0.56
     C               0.51
     O               0.50
     Key             0.49
     G               0.49
     F               0.49
     P               0.48
     Examples        0.48
     How             0.46
     Example         0.46

    Act Density 0.632%

    No Known Activations