INDEX
    NEGATIVE LOGITS
    Token             Logit
    " specialties"    -0.09
    "hazi"            -0.09
    "(크기"           -0.09
    " emblem"         -0.09
    "iasi"            -0.09
    "нош"             -0.08
    "rouw"            -0.08
    "akh"             -0.08
    "ToLocal"         -0.08
    "舗"              -0.08
    POSITIVE LOGITS
    Token             Logit
    " topic"          0.64
    " subject"        0.52
    "topic"           0.46
    " Topic"          0.44
    "Topic"           0.42
    " theme"          0.41
    "主题"            0.38
    "subject"         0.38
    "-topic"          0.37
    " topics"         0.37
    Activation Density: 0.396%

    No Known Activations