INDEX
Explanations
phrases related to emphasizing a specific point or idea
Negative Logits
hens      -0.77
tails     -0.76
ーク      -0.72
oran      -0.71
obb       -0.70
orian     -0.69
orah      -0.68
uty       -0.67
istance   -0.67
arest     -0.66
Positive Logits
they      0.88
pesky     0.80
there     0.79
THEY      0.79
fateful   0.75
someday   0.73
soever    0.73
kind      0.72
we        0.71
although  0.70
Activations Density: 0.274%