    NEGATIVE LOGITS
    Token               Logit
    TintMode            -0.68
    ValueGeneration     -0.68
    titleMargin         -0.68
    Trọng               -0.66
     Мексичка           -0.65
    PreferredItem       -0.65
    DebuggerStep        -0.64
    daß                 -0.63
    tôi                 -0.63
    ToAction            -0.62
    POSITIVE LOGITS
    Token               Logit
    <h1>                 1.72
    (no visible token)   0.65
    <h2>                 0.61
    The                  0.54
    addContainerGap      0.54
    <strong>             0.52
    Plus                 0.49
    <bos>                0.48
    Q                    0.48
    B                    0.48
    Activation Density: 0.018%

    No Known Activations