    Explanations

    verb followed by preposition

    Negative Logits
    "of"               0.44
    "اک"               0.43
    " 이야"             0.41
    "Einstellungen"    0.40
    " фаразы"          0.40
    "含ま"              0.40
    "ूज"               0.39
    "ীষ"               0.39
                       0.39
    " этому"           0.39
    Positive Logits
    "ت"                0.71
    " by"              0.59
    "س"                0.57
    " with"            0.56
    " to"              0.53
    " on"              0.53
    "ina"              0.49
    "с"                0.49
    " from"            0.48
    "et"               0.47
    Activation Density: 0.874%

    No Known Activations