INDEX
    Explanations
    NEGATIVE LOGITS
    Token                 Logit
    "TYPES"               -0.06
    " meds"               -0.06
    "Aggregate"           -0.06
    "adar"                -0.06
    " похож"              -0.06
    " spectra"            -0.06
    (no visible token)    -0.06
    "312"                 -0.06
    " indoors"            -0.06
    ".Class"              -0.06
    POSITIVE LOGITS
    Token                 Logit
    "ron"                 0.07
    " Trump"              0.07
    (no visible token)    0.07
    " transformers"       0.07
    ".Topic"              0.07
    " المؤ"               0.06
    (no visible token)    0.06
    " flattened"          0.06
    "ellery"              0.06
    "Trump"               0.06
    Activation Density: 0.002%

    No Known Activations