INDEX
    Explanations
    Negative Logits
    " humour"       -0.08
    " frequency"    -0.07
    " mixture"      -0.07
    " tumor"        -0.07
    " durations"    -0.07
    "Fat"           -0.07
    "Samples"       -0.07
    " tumors"       -0.07
    " sample"       -0.07
    " rhythm"       -0.06
    Positive Logits
    (whitespace)       0.07
    (whitespace)       0.07
    "SCRI"             0.07
    " pursuant"        0.07
    " libertin"        0.06
    (whitespace)       0.06
    "JKLM"             0.06
    "enschaft"         0.06
    ".LabelControl"    0.06
    " SwiftUI"         0.06
    Act Density 0.005%

    No Known Activations