INDEX
Explanations
expressions of personal reflection and emotional evolution
Negative Logits
inde          -0.16
eree          -0.15
today         -0.15
indeed        -0.15
edin          -0.14
rocker        -0.13
scoreboard    -0.13
reporter      -0.13
raph          -0.13
Their         -0.13
Positive Logits
haha          0.20
fucked        0.18
fuck          0.16
sort          0.16
Fuck          0.15
fucks         0.15
[of           0.15
shit          0.15
shitty        0.15
[from         0.14
Activations Density: 0.027%