INDEX
Explanations
words related to accolades and praises
instances of excitement or joy
Negative Logits
acia          -0.80
puter         -0.74
ible          -0.74
itri          -0.72
ances         -0.71
Nadu          -0.69
pering        -0.68
iance         -0.67
BIP           -0.65
Impossible    -0.64
Positive Logits
^^^^          1.19
^^            0.88
hammad        0.74
vernment      0.73
nton          0.72
orters        0.72
^             0.66
gio           0.66
Schwarz       0.66
~~~~~~~~      0.65
Activations Density 0.020%