INDEX
Explanations
references to searching for a lost item or character
Negative Logits
fucking     -0.21
FUCK        -0.19
Fucking     -0.18
fuck        -0.18
fucks       -0.17
fucked      -0.17
urent       -0.17
hell        -0.17
fuck        -0.16
Fuck        -0.16
Positive Logits
friendship  0.17
adventure   0.17
Friendship  0.16
eroon       0.15
Gross       0.15
humans      0.15
Hedge       0.15
Helpful     0.15
Humph       0.15
jal         0.15
Activations Density: 0.235%