Explanations
expressions of gratitude or casual conversation
Negative Logits
Members       -0.64
Ladies        -0.63
Members       -0.63
Mitgliedern   -0.60
medlemmer     -0.60
Mitglieder    -0.56
Gentlemen     -0.56
members       -0.55
members       -0.54
              -0.53
Positive Logits
buddy          1.23
mate           1.13
dude           1.01
man            0.92
buddy          0.90
bud            0.87
mate           0.79
friend         0.77
Dude           0.77
bro            0.77
Activations Density 0.136%