Explanations
references to personal experiences and interactions
Negative Logits
Token             Logit
we                -0.61
our               -0.60
μας               -0.58
กัน               -0.57
själva            -0.56
Our               -0.56
hearts            -0.55
ConstraintMaker   -0.54
ourselves         -0.53
Our               -0.53
Positive Logits
Token             Logit
myself            1.22
myself            0.95
myſelf            0.93
Myself            0.85
my                0.82
ProtoMessage      0.81
moje              0.80
personalmente     0.80
nakalista         0.75
adaptiveStyles    0.72
Activations Density: 0.567%