INDEX
Explanations
occurrences of the word "have" and its variations, indicating discussions around possession or necessity
Negative Logits
itself        -0.25
Its           -0.18
themselves    -0.18
its           -0.17
himself       -0.17
Its           -0.16
ties          -0.15
ince          -0.15
bana          -0.15
irse          -0.15
Positive Logits
ourselves     0.22
seen          0.21
heard         0.20
known         0.20
Seen          0.19
Seen          0.18
heard         0.17
talked        0.17
learned       0.17
discussed     0.16
Activations Density: 0.144%