INDEX
Explanations
phrases indicating lack of knowledge or uncertainty
instances of the phrase "didn't know."
Negative Logits
Token         Logit
ItemTracker   -0.89
ĪĴ            -0.85
isco          -0.83
phrine        -0.75
thren         -0.74
otion         -0.74
ishable       -0.72
uilding       -0.72
attery        -0.72
assi          -0.71
Positive Logits
Token         Logit
ledge         0.96
how           0.94
lege          0.91
ledged        0.87
exactly       0.85
beforehand    0.81
whether       0.78
ABOUT         0.76
enough        0.75
how           0.75
Activations Density: 0.042%
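The page does not say how the density figure is computed. A minimal sketch, assuming it is the fraction of sampled token positions on which the feature's activation is above zero, might look like the following. The function name `activation_density`, the `threshold` parameter, and the toy data are all hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 10,000 token positions, 4 of them with a nonzero activation.
toy = [0.0] * 9996 + [1.3, 0.2, 4.1, 0.7]
density = activation_density(toy)
print(f"{density:.3%}")  # formatted as a percentage, like the figure above
```

Under this reading, a density of 0.042% would correspond to the feature firing on roughly 4 of every 10,000 tokens.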