Explanations
the word "what" followed by either a question or a description of something
the word "what" in various contexts, suggesting an exploration of inquiry or questioning
Negative Logits
nationwide     -0.61
Mock           -0.58
misinterpret   -0.56
uating         -0.55
enger          -0.54
enburg         -0.54
ointed         -0.53
defect         -0.53
Warn           -0.52
eaves          -0.52
Positive Logits
soever         1.63
constitutes    1.01
happens        0.97
amounts        0.91
else           0.90
follows        0.85
appears        0.81
feels          0.80
?).            0.79
seems          0.79
Activations Density: 0.076%