Explanations
interesting and diverse topics, from books and literature critiques to political statements and scientific discussions
Negative Logits
referen        -0.62
umerous        -0.61
customary      -0.61
brush          -0.59
combination    -0.59
foreground     -0.58
inhibitor      -0.58
muted          -0.58
inclined       -0.57
spurious       -0.57
Positive Logits
Lives          0.91
Matters        0.91
Dreams         0.90
Secrets        0.89
Yourself       0.89
Strategies     0.88
Lies           0.87
Madness        0.87
Files          0.86
Tomorrow       0.85
Activations Density: 0.471%