Explanations
phrases related to updates or announcements
the presence of the word "We" and its variations, indicating a focus on collective or inclusive statements
Negative Logits
forms               -0.59
guiActiveUnfocused  -0.59
panic               -0.58
REDACTED            -0.58
eviction            -0.58
flows               -0.57
Reply               -0.56
Leilan              -0.56
âĸ¬                 -0.55
Flavoring           -0.55
Positive Logits
're       1.17
eks       1.06
athered   1.05
've       1.03
asel      1.02
igh       1.02
akening   1.00
ibo       0.99
arers     0.99
'll       0.96
Activations Density 0.138%
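
The density figure can be read as the share of token positions on which the feature fires. The sketch below illustrates that reading only. It assumes "fires" means an activation strictly above a threshold of 0, which the page does not state; the function name `activation_density` and the sample data are hypothetical, chosen only so that the arithmetic yields the same percentage.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: a position counts as active when its activation is strictly
    greater than the threshold (0 by default). The source page does not
    define how its density value is computed.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 token positions, 138 of which are active.
sample = [0.0] * 99_862 + [0.7] * 138
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.138%
```

A density this low means the feature is sparse: under the assumption above, it is active on roughly 1 in 725 token positions.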