INDEX
    Explanations

    sentences that express curiosity and fascination with the natural world and human experience

    Negative Logits
    atori      -0.15
    ibi        -0.15
    rie        -0.15
    ewise      -0.14
    onica      -0.14
    heard      -0.13
    oomla      -0.13
    θμ         -0.13
    }elseif    -0.13
    è̶          -0.13
    Positive Logits
     fasc           0.32
     Fasc           0.30
     science        0.29
     curiosity      0.28
     Science        0.27
     fascination    0.27
     scientists     0.26
     scientist      0.25
     scientific     0.24
     fascinating    0.24
    Activation Density: 0.248%

    No Known Activations