INDEX
Explanations
phrases with the word "have" followed by some additional description or context
the phrase "we have" in various contexts
Negative Logits
cone      -0.75
eem       -0.74
oshi      -0.70
drive     -0.68
tip       -0.68
conom     -0.67
roll      -0.65
alter     -0.65
ensing    -0.63
icking    -0.62
Positive Logits
seen         1.20
ourselves    1.14
witnessed    1.07
heard        1.05
been         0.96
talked       0.96
learned      0.95
gotten       0.95
reached      0.94
learnt       0.92
Activations Density 0.134%
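
The density figure is easier to read with a definition in hand. The sketch below assumes, without confirmation from this page, that "Activations Density" means the fraction of token positions on which the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the toy data are all hypothetical, and the toy counts are made up so that they reproduce the same percentage.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumption: density counts positions where the feature fires at all;
    the page itself does not state how the figure was computed.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Made-up toy data: the feature fires on 134 of 100,000 token positions.
toy = [1.0] * 134 + [0.0] * (100_000 - 134)
density = activation_density(toy)
print(f"{density:.3%}")  # prints 0.134%
```

Under that reading, a density of 0.134% would mean the feature is active on roughly 1 in 750 tokens, which fits a feature tied to a narrow pattern such as "we have" followed by a past participle.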