INDEX
Explanations
phrases emphasizing a high quantity or degree of something
expressions emphasizing abundance or excess
Negative Logits
ares        -0.82
iques       -0.78
aves        -0.78
hammad      -0.77
cles        -0.77
ends        -0.77
runs        -0.76
arts        -0.74
asts        -0.72
breakers    -0.70
Positive Logits
reason            0.99
overlap           0.99
evidence          0.98
misinformation    0.91
similarity        0.91
difference        0.90
possibility       0.90
confusion         0.89
indication        0.89
irony             0.88
Activations Density 0.113%