    Explanations

    expressions of affirmation or agreement

    NEGATIVE LOGITS
    ]--;
    -0.81
    WriteBarrier
    -0.80
    NUMX
    -0.79
     '\\;'
    -0.78
    aarrggbb
    -0.78
    ".\n
    -0.75
     TestBed
    -0.75
    pozdrawiam
    -0.73
    _\n
    -0.73
     ),\n
    -0.71
    POSITIVE LOGITS
     Well
    2.04
    Well
    2.02
     WELL
    1.25
    well
    1.21
    WELL
    1.11
    Welp
    1.03
     well
    0.90
     Ну
    0.85
     Okay
    0.83
     Welles
    0.83
    Activation Density 0.044%
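As a rough illustration of what a density figure like this usually means, the sketch below computes the percentage of tokens on which a feature has a non-zero activation. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical, and the dashboard's actual computation may differ (for example in how tokens are sampled).

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: density is the share of sampled tokens on which the
    feature fires at all; this is not confirmed by the source page.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 10,000 tokens, with the feature active on 4 of them.
acts = [0.0] * 9996 + [1.3, 0.7, 2.1, 0.2]
print(f"Activation density {activation_density(acts):.3f}%")
```

Under this reading, a density of 0.044% would correspond to the feature firing on roughly 4 to 5 tokens out of every 10,000 sampled.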

    No Known Activations