Explanations
words associated with behavior and behavioral change
Negative Logits
Token        Logit
Laurens      -0.74
MILLIS       -0.65
Milne        -0.65
Clooney      -0.63
Tup          -0.62
rafted       -0.62
gzip         -0.61
Fitz         -0.60
Fitzpatrick  -0.60
Nip          -0.60
Positive Logits
Token        Logit
behavior     2.23
behaviour    2.11
Behavior     2.06
behavior     2.00
BEHAVIOR     1.93
Behavior     1.92
behaviors    1.91
Behaviour    1.88
behaviours   1.81
behaviour    1.81
Activations Density: 0.071%