INDEX
Explanations
sentences that express curiosity and fascination with the natural world and human experience
Negative Logits
atori      -0.15
ibi        -0.15
rie        -0.15
ewise      -0.14
onica      -0.14
heard      -0.13
oomla      -0.13
θμ         -0.13
}elseif    -0.13
è̶          -0.13
Positive Logits
fasc           0.32
Fasc           0.30
science        0.29
curiosity      0.28
Science        0.27
fascination    0.27
scientists     0.26
scientist      0.25
scientific     0.24
fascinating    0.24
Activations Density: 0.248%