INDEX
Explanations
interactions between individuals with titles and formal language, suggesting a conversational setting
conversational phrases that include names and titles
Negative Logits
igen          -0.86
cases         -0.69
fueling       -0.67
interchange   -0.67
edition       -0.62
projects      -0.62
polar         -0.62
seals         -0.62
exchanges     -0.60
suites        -0.60
Positive Logits
Gentleman         0.85
Disciple          0.82
Beware            0.81
beware            0.78
thank             0.77
congratulations   0.76
Dear              0.73
please            0.72
Metatron          0.72
Mistress          0.72
Activations Density 0.248%
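
A density figure like the one above is usually read as the share of sampled tokens on which the feature fires at all. The sketch below illustrates that reading with synthetic numbers. It assumes "fires" means an activation strictly above zero; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: a token counts as active when its activation is strictly
    greater than the threshold (0.0 by default).
    """
    if not activations:
        raise ValueError("activations must not be empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic data, for illustration only: 25 active tokens out of 10,000.
sample = [0.0] * 9975 + [1.3] * 25
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.250%
```

A feature with a density this low is sparse: it is silent on the vast majority of tokens, which fits a feature tied to a narrow pattern such as forms of address.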