INDEX
    Explanations

    hiking with other activities

    Negative Logits
              1.48
    ۰         1.19
    اد        1.05
    sk        1.02
    ной       1.02
    f         1.01
    that      0.99
    ні        0.99
    ่า        0.98
    که        0.97
    Positive Logits
    ه         1.30
    a         1.17
     hikers   1.11
    ,         1.06
     hikes    0.98
     hiking   0.97
    Hiking    0.95
     on       0.95
     hiked    0.93
    ,’        0.92
    Act Density 0.003%

    No Known Activations