INDEX
Explanations
questions starting with "So what" followed by a verb
questions beginning with "what."
Negative Logits
DAQ     -0.74
JM      -0.70
yers    -0.68
Kit     -0.68
Bound   -0.67
TM      -0.65
MER     -0.65
eds     -0.64
ERG     -0.63
KI      -0.62
Positive Logits
happens      0.91
ensued       0.88
else         0.86
happened     0.85
etheless     0.85
transpired   0.84
separates    0.78
emerges      0.76
happ         0.75
exactly      0.73
Activations Density: 0.076%