INDEX
Explanations
mentions of 'Our'
statements that convey ownership or collective identity
Negative Logits
achy         -0.58
dash         -0.56
repe         -0.55
he           -0.55
level        -0.55
bothered     -0.55
itiz         -0.54
skills       -0.54
fx           -0.54
crossover    -0.54
Positive Logits
Our          3.12
Our          2.40
OUR          2.03
We           1.85
our          1.59
ourselves    1.57
Your         1.50
ours         1.48
We           1.38
My           1.37
Activations Density 0.007%