Explanations
personal reflections or introspection
statements expressing personal identity or self-awareness
Negative Logits
angering          -0.70
Detailed          -0.69
SpaceEngineers    -0.67
targeting         -0.67
Officials         -0.66
Leaks             -0.65
Consumer          -0.65
CPU               -0.63
Intel             -0.63
Demand            -0.63
Positive Logits
autobi            1.13
kindred           0.92
poet              0.91
Twain             0.87
biography         0.86
psychologist      0.86
poems             0.85
novelist          0.85
autobiography     0.84
narrator          0.82
Activations Density 1.482%