    Negative Logits
    PopMatrix         -0.07
     knockout         -0.07
    =plt              -0.07
                      -0.06
     özel             -0.06
    =@                -0.06
     Proto            -0.06
    合同              -0.06
     jitter           -0.06
                      -0.06
    Positive Logits
    .Label            0.08
     Reef             0.07
     admired          0.07
    .remaining        0.06
    restaurant        0.06
     urges            0.06
    .resource         0.06
     ModelRenderer    0.06
    _RW               0.06
     Lords            0.06
    Activation Density: 0.006%

    No Known Activations