Explanations
phrases that introduce or emphasize a point of contrast
dash patterns or interruptions in text formatting
Negative Logits
Token        Logit
protective   -0.70
onut         -0.68
baking       -0.68
emy          -0.67
grave        -0.67
obar         -0.67
discern      -0.65
heart        -0.65
iceberg      -0.64
drain        -0.64
Positive Logits
Token        Logit
lance        0.92
fuck         0.90
[[           0.90
)--          0.85
DOWN         0.85
FORE         0.84
NOW          0.83
==           0.83
SOURCE       0.83
_-           0.82
Activations Density: 0.012%