Explanations
mentions of specific events or experiences
phrases indicating previous experiences or lack thereof
Negative Logits
akia            -0.64
eventually      -0.61
çīĪ             -0.58
shortly         -0.57
staggered       -0.57
momentarily     -0.56
abound          -0.54
Mub             -0.54
occasionally    -0.53
aided           -0.53
Positive Logits
nor             1.20
anything        0.96
anywhere        0.96
anybody         0.96
anything        0.87
yet             0.86
since           0.82
anymore         0.79
anyone          0.79
EVER            0.78
Activations Density 0.315%