INDEX
    Explanations
    NEGATIVE LOGITS
    Baltimore       1.19
     Vanderbilt     1.19
     Yoruba         1.18
     Berkeley       1.18
     Louisville     1.16
     Beirut         1.16
     Yelp           1.13
     Isis           1.13
     Detective      1.12
     Danville       1.12
    POSITIVE LOGITS
     ś              1.53
     przy           1.43
                    1.39
    ł               1.38
     š              1.38
    č               1.38
                    1.37
     już            1.37
     tylko          1.36
                    1.35
    Activation Density 0.052%

    No Known Activations