INDEX
Explanations
the repeated use of the word "have" in different contexts
Negative Logits
weed         -0.76
Discuss      -0.74
catentry     -0.69
TG           -0.66
ove          -0.62
Returning    -0.62
airs         -0.62
AH           -0.62
IM           -0.61
acs          -0.60
Positive Logits
been         1.08
been         0.95
gotta        0.95
seen         0.94
heard        0.79
gotten       0.78
chosen       0.76
Tube         0.74
seen         0.73
eaten        0.72
Activations Density 0.041%