    NEGATIVE LOGITS
     Arte       -0.08
    %"><        -0.08
     hObject    -0.07
     activist   -0.07
     borr       -0.07
    ambiguous   -0.07
    =======↵    -0.06
     arte       -0.06
    _           -0.06
     struck     -0.06
    POSITIVE LOGITS
     pool       0.19
     Pool       0.17
    pool        0.17
     pools      0.16
    Pool        0.16
    POOL        0.12
    ool         0.12
    (pool       0.12
    _pool       0.10
                0.10
    Activation Density: 0.008%

    No Known Activations