INDEX
    Explanations

    prepositions followed by entities/places/actions

    Negative Logits
    " нельзя"      0.66
    " placebo"     0.59
    " sebagainya"  0.58
    " cenderung"   0.57
    " де"          0.57
    " লোকজন"       0.56
    " geen"        0.55
    " δεν"         0.54
    " спраши"      0.54
    "っぽい"        0.54
    Positive Logits
    " aceste"      0.85
    " aquest"      0.83
    "aquest"       0.80
    " această"     0.79
    " هذه"         0.78
    " acest"       0.76
    " هذا"         0.73
    " Acest"       0.73
    " цієї"        0.73
    "这位"          0.70
    Act Density 0.000%

    No Known Activations