INDEX
    Explanations

    phrases relating to scrutiny and assessment of actions and relationships

    Negative Logits
    `;
    
    -1.03
    `,
    
    -1.00
    ^(@)
    -0.97
     `;
    -0.94
    `
    
    -0.92
    %";
    -0.88
     (\%
    -0.84
    `),
    -0.83
    ImageContext
    -0.83
    ^(@
    -0.82
    Positive Logits
    
    2.47
     
    1.94
    
    1.86
    .
    1.66
    
    
    1.00
    ​​
    0.89
    0.87
    0.80
    #
    0.79
    0.78
    Act Density 0.107%

    No Known Activations