INDEX
Explanations
ambiguous and enigmatic phrases
articles preceding nouns
Negative Logits
Token         Logit
adj           -0.91
grounds       -0.82
evidence      -0.75
Industries    -0.75
onto          -0.72
mite          -0.72
Things        -0.69
aniel         -0.69
encies        -0.68
here          -0.68
Positive Logits
Token         Logit
bang          1.06
bunch         1.04
handful       0.99
vengeance     0.99
plethora      0.98
few           0.92
multitude     0.90
smile         0.88
penchant      0.88
caveat        0.87
Activations Density: 0.142%