INDEX
    Explanations

    communication

    Negative Logits
    Token               Logit
    " Ranch"            -0.07
    " inch"             -0.07
    " Dataset"          -0.07
    "_fault"            -0.07
    "80"                -0.06
    " top"              -0.06
    "0"                 -0.06
    " pred"             -0.06
    " most"             -0.06
    " set"              -0.06

    Positive Logits
    Token               Logit
    " communication"    0.17
    " Communication"    0.16
    "Communication"     0.13
    " communications"   0.12
    " Communications"   0.11
    " communicating"    0.11
    " communicate"      0.11
    "communication"     0.10
    " communic"         0.10
    "communic"          0.09

    Act Density 0.025%

    No Known Activations