    Explanations

    abstract nouns followed by prepositions

    Negative Logits
    Token            Logit
    " Batman"        0.34
    " you"           0.32
    "t"              0.32
    " úpl"           0.31
    " Buenos"        0.31
    " Santa"         0.31
    " University"    0.28
    " theaters"      0.28
    " Sk"            0.28
    " [..."          0.28
    Positive Logits
    Token              Logit
    "становление"      0.35
    "фра"              0.34
    " incidences"      0.32
    "меча"             0.30
    " предполагает"    0.30
    " щодо"            0.29
    "하는"             0.29
    "的三"             0.29
    "اتها"             0.29
    " местные"         0.29
    Activation Density: 0.065%

    No Known Activations