    Negative Logits
    Token           Logit
    "boy"           -0.07
    " takový"       -0.07
    "dess"          -0.06
    "ührung"        -0.06
    " üz"           -0.06
    " історії"      -0.06
    " Juni"         -0.06
    " دشمن"         -0.06
    " baggage"      -0.06
    "طبي"           -0.06
    Positive Logits
    Token           Logit
    "\tcin"         0.06
    " alleles"      0.06
    " Nottingham"   0.06
    ".nz"           0.06
    " petroleum"    0.06
    "_resolution"   0.06
    "-speaking"     0.06
    "-rating"       0.06
                    0.06
    " Prompt"       0.06
    Activation Density: 0.017%

    No Known Activations