    NEGATIVE LOGITS
    Token             Logit
    " technolog"      -0.07
    "Ant"             -0.07
    (whitespace)      -0.07
    (whitespace)      -0.07
    " Quantity"       -0.06
    " #"              -0.06
    (whitespace)      -0.06
    "мент"            -0.06
    "-automatic"      -0.06
    (not rendered)    -0.06
    POSITIVE LOGITS
    Token             Logit
    " here"           0.19
    " Here"           0.16
    "here"            0.14
    "Here"            0.13
    " HERE"           0.12
    "HERE"            0.11
    ".Here"           0.11
    " hier"           0.10
    "_here"           0.09
    " herein"         0.09
    Activation Density: 0.059%

    No Known Activations