    NEGATIVE LOGITS
     Μη         -0.07
    on          -0.06
    remarks     -0.06
     homme      -0.06
     أم         -0.06
    (rp         -0.06
     WORD       -0.06
    ohn         -0.06
     Won        -0.06
    384         -0.06
    POSITIVE LOGITS
     festival   0.20
     Festival   0.19
     festivals  0.16
     Fest       0.13
    estival     0.12
    fest        0.11
     Carnival   0.11
     fest       0.10
     carnival   0.09
     Fiesta     0.08
    Activation density: 0.006%

    No Known Activations