INDEX
Explanations
phrases about possession or ownership
the occurrence of the word "have" and its variations in different contexts
Negative Logits
Shock     -0.62
Warm      -0.57
shock     -0.57
clicks    -0.56
Butcher   -0.56
DOT       -0.56
wait      -0.55
Boss      -0.55
âĺħ       -0.54
Barron    -0.53
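The token âĺħ in the negative-logit list looks like encoding debris, but it matches the way GPT-2-style byte-level BPE vocabularies display raw bytes. The sketch below rebuilds that byte-to-character table and inverts it to recover the underlying text. It assumes the model behind this feature uses that tokenizer family, which the page does not state. The helper names `gpt2_byte_decoder` and `decode_token` are hypothetical and are not part of any dashboard API.

```python
def gpt2_byte_decoder() -> dict[str, int]:
    """Map each display character back to its byte (GPT-2-style byte-level BPE).

    Assumption: the tokenizer uses the standard GPT-2 byte-to-unicode table.
    """
    # Bytes that are already printable are displayed as themselves.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    offset = 0
    # Every remaining byte is shifted to a code point at 256 and above.
    for b in range(256):
        if b not in printable:
            byte_values.append(b)
            code_points.append(256 + offset)
            offset += 1
    return {chr(cp): b for b, cp in zip(byte_values, code_points)}


def decode_token(token: str) -> str:
    """Turn a vocabulary-style token string into the text it stands for."""
    decoder = gpt2_byte_decoder()
    raw = bytes(decoder[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


print(decode_token("âĺħ"))  # prints ★
```

Under this assumption the token corresponds to the bytes E2 98 85, the UTF-8 encoding of ★ (U+2605). The same table displays a leading space as Ġ, so a page that hides that marker can show two visually identical tokens, one with a leading space and one without.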
Positive Logits
been          1.12
gotten        1.00
seen          0.98
done          0.96
been          0.94
encountered   0.93
amassed       0.89
hed           0.89
undertaken    0.87
learnt        0.87
Activations Density 0.099%