    Negative Logits
     Theſe
    -0.90
     resourceCulture
    -0.88
    Espèce
    -0.88
     démocr
    -0.85
    AutoScale
    -0.83
     nargin
    -0.78
     noires
    -0.78
    AutoScaleMode
    -0.77
     Мексичка
    -0.75
     Jefus
    -0.75
    Positive Logits
    s
    0.53
    <strong>
    0.52
    ↵↵
    0.51
    0.51
    ↵↵↵
    0.45
    [
    0.44
      
    0.43
    </strong>
    0.43
    <
    0.43
    FormTagHelper
    0.42
    Activation Density 0.087%

    No Known Activations