INDEX
Explanations
questions posed in the text
Negative Logits
hya            -0.70
bably          -0.67
OUND           -0.67
paio           -0.64
Statistics     -0.61
iful           -0.59
hematically    -0.59
undai          -0.59
effects        -0.58
outcomes       -0.58
Positive Logits
asks           0.78
asks           0.65
asked          0.65
inquired       0.62
asking         0.62
pond           0.61
quer           0.59
Bought         0.58
apologise      0.58
Quest          0.57
Activations Density: 0.018%