INDEX
Explanations
words related to people's names, often including their titles or nicknames
quotations from dialogue or speech
Negative Logits
Token           Logit
reconcil        -0.76
intersections   -0.74
reconcile       -0.73
overturn        -0.73
coincide        -0.72
inauguration    -0.72
recount         -0.71
forgiven        -0.69
reorgan         -0.69
orgasm          -0.67
Positive Logits
Token           Logit
Sem              1.05
Fat              1.02
Thor             0.96
Big              0.95
Hung             0.94
Fuck             0.92
Arm              0.91
Happy            0.91
Skip             0.91
vik              0.90
Activations Density: 0.036%