INDEX
Explanations
phrases containing the word "our"
instances of the word "our" emphasizing shared experiences or collective ownership
Negative Logits
WAR          -0.70
Rasmussen    -0.68
FU           -0.68
Rove         -0.66
DERR         -0.63
Crosby       -0.63
Tanz         -0.61
Lovecraft    -0.59
aughtered    -0.59
buckle       -0.58
Positive Logits
selves       1.32
neau         1.09
neys         1.06
dain         0.95
cery         0.93
¯¯¯¯¯¯¯¯     0.89
ishment      0.89
izons        0.86
izont        0.83
idine        0.81
Activations Density 0.018%
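
As a rough illustration of how a density figure like the one above could be obtained, the sketch below treats density as the fraction of sampled tokens on which the feature's activation exceeds a threshold. That definition, the function name `activation_density`, the `threshold` parameter, and the toy data are all assumptions made for illustration; the page itself does not state how the value was computed.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of activations strictly above `threshold`.

    Assumption: "density" means the share of sampled tokens on which
    the feature fires. The source page does not define the term.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: the feature fires on 18 of 100,000 tokens.
toy = [0.0] * 99_982 + [1.5] * 18
density = activation_density(toy)
print(f"{density:.3%}")  # prints "0.018%"
```

On this reading, a density of 0.018% would mean the feature is active on roughly 18 of every 100,000 tokens, which is what one might expect of a feature tied to a single word in a narrow set of contexts.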