INDEX
    Explanations

    adverbs followed by actions

    Negative Logits
     entièrement  0.94
     rentable  0.89
     درجہ  0.81
     sensational  0.80
     frivolous  0.79
     imaginative  0.79
     potable  0.79
     satirical  0.78
     inexplic  0.77
     inexplicable  0.77
    Positive Logits
    0.79
     всей  0.76
    зе  0.70
    อื่นๆ  0.68
    たくさん  0.68
    🥐  0.67
    navbarNav  0.65
     کدام  0.64
     каждой  0.64
    "".  0.63
    Act Density 0.093%

    No Known Activations