INDEX
    Explanations

    words related to cultural or racial issues

    references to the term "Cultural Caviar" and related phrases

    Negative Logits
     spike         -0.76
     switch        -0.72
     switching     -0.71
     capsule       -0.67
     randomized    -0.65
     phase         -0.65
     toggle        -0.64
     Toggle        -0.61
     gauge         -0.60
     Pinterest     -0.59
    Positive Logits
    iar            4.88
    iae            1.17
    icultural      1.03
    ibia           1.03
    iate           1.01
    ials           0.99
    ias            0.99
    ipal           0.98
    icult          0.97
    iaries         0.97
    Activation Density: 0.016%

    No Known Activations