INDEX
    Explanations

    phrases related to responsibility and obligation

    NEGATIVE LOGITS
    ですが
    -0.13
    шила
    -0.12
    istes
    -0.12
     ایشان
    -0.12
    '],$_
    -0.12
    ']!='
    -0.12
     وی
    -0.12
    ITEM
    -0.11
    нят
    -0.11
     которых
    -0.10
    POSITIVE LOGITS
     it
    1.27
    它
    0.94
     оно
    0.79
     It
    0.78
    It
    0.72
     nó
    0.71
    	it
    0.69
    _it
    0.69
    ,it
    0.67
     воно
    0.60
    Act Density 4.551%

    No Known Activations