INDEX
Explanations
phrases inviting interaction or communication
Negative Logits
Mub          -0.61
Parenthood   -0.61
rama         -0.60
Phelps       -0.60
abama        -0.58
Boh          -0.57
msec         -0.57
Maw          -0.57
notch        -0.57
Boko         -0.57
Positive Logits
free          1.05
free          0.93
FREE          0.91
comfortable   0.88
obligated     0.87
safe          0.86
compelled     0.86
confident     0.85
good          0.84
FREE          0.81
Activations Density 0.028%