Explanations
profane and emphatic expressions
expressions of frustration or strong emphasis
Negative Logits
Token       Logit
IDS         -0.69
OHN         -0.68
DN          -0.64
KEN         -0.63
Flavoring   -0.62
quo         -0.61
ynthesis    -0.60
KY          -0.57
amide       -0.57
":""},{"    -0.57
Positive Logits
Token       Logit
near        1.05
ibly        1.01
ation       0.86
atio        0.84
orse        0.77
near        0.74
ably        0.72
ated        0.72
damned      0.72
emed        0.72
Activations Density: 0.024%