INDEX
Explanations
phrases indicating categorization or classification
phrases that suggest a "sort of" or "kind of" expression
Negative Logits
essors    -0.89
omers     -0.85
oons      -0.84
years     -0.83
Seconds   -0.83
ands      -0.83
anners    -0.83
ients     -0.82
inches    -0.82
aries     -0.81
Positive Logits
consolation    0.90
intermediary   0.86
existential    0.84
miracle        0.84
miraculous     0.84
ideological    0.83
afterlife      0.82
magic          0.82
mechanism      0.82
relationship   0.80
Activations Density: 0.052%