INDEX
Explanations
occurrences of the word "command" or related terms
references to military or authoritative commands
Negative Logits
sung       -0.70
RAG        -0.68
Investor   -0.65
bley       -0.64
iders      -0.64
iate       -0.62
Hawking    -0.62
Ô          -0.61
Bloom      -0.60
ONSORED    -0.60
Positive Logits
eering     1.39
eers       1.11
eer        1.01
line       1.00
ments      0.99
ment       0.94
line       0.91
sender     0.85
aining     0.84
ement      0.84
Activations Density 0.034%
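
The density figure is usually read as the share of token positions on which the feature fires at all. The sketch below shows that calculation under assumptions that are not in the source: the function name `activation_density`, the zero threshold, and the sample data are all hypothetical, and the dashboard's own computation may differ.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: a feature counts as "firing" when its activation is strictly
    greater than `threshold` (0.0 here); the real dashboard may use another rule.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical data: 10,000 token positions, 3 of which activate the feature.
sample = [0.0] * 9997 + [1.2, 0.4, 2.5]
density = activation_density(sample)
print(f"{density:.3%}")  # formatted as a percentage, like the dashboard figure
```

On this reading, a density of 0.034% would mean the feature fires on roughly 34 of every 100,000 tokens.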