INDEX
    Explanations

    percentages or statistical figures

    Negative Logits
     philos       -0.75
     sourcing     -0.69
     parach       -0.68
     distilled    -0.67
     snowball     -0.66
     pudding      -0.65
     stocking     -0.65
     recycling    -0.64
     masc         -0.64
     stim         -0.64
    Positive Logits
    9     1.37
    5     1.37
    95    1.33
    97    1.32
    6     1.32
    7     1.31
    8     1.31
    98    1.27
    93    1.27
    96    1.25
    Activation Density: 0.067%

    No Known Activations