INDEX
    Explanations

    phrases related to control and regulation

    Head Attr Weights
    0:0.02
    1:0.02
    2:0.05
    3:0.07
    4:0.16
    5:0.03
    6:0.03
    7:0.33
    8:0.03
    9:0.03
    10:0.10
    11:0.08
    Negative Logits
    [undecodable token]  -1.80
    arnaev               -1.58
    evidence             -1.47
    bery                 -1.46
    lished               -1.45
    nces                 -1.43
    pared                -1.42
    payer                -1.41
    pite                 -1.40
    ilipp                -1.39
    Positive Logits
     functions    1.64
     hunger       1.54
     vibration    1.53
     Alexa        1.45
     airflow      1.45
     brightness   1.43
     axis         1.42
     potency      1.37
     behaviors    1.36
     playback     1.34
    Act Density 0.023%

    No Known Activations