    Explanations

    indicators of changes in activity or measurements with directional arrows

    Negative Logits
    Token                   Logit
    " itſelf"               -0.77
    "ValueStyle"            -0.73
    "Gweler"                -0.72
    " myſelf"               -0.71
    " CreateTagHelper"      -0.69
    " purpoſe"              -0.64
    " reaſon"               -0.64
    " ftate"                -0.61
    " Theod"                -0.59
    " laſt"                 -0.59
    Positive Logits
    Token                   Logit
    " ↑"                     1.75
                             0.74
                             0.62
                             0.62
    " :="                    0.58
    "этому"                  0.56
    " ^"                     0.56
    "énario"                 0.56
    "таратура"               0.56
    "↑↑"                     0.56
    Activation Density: 0.139%

    No Known Activations