INDEX
Explanations
occurrences of the word "note" and its variations
Negative Logits
heit        -0.17
ãĥ¼ãĥĢ      -0.16
rost        -0.16
s           -0.16
ode         -0.15
iggs        -0.15
à¸ĩาà¸Ļ      -0.15
olph        -0.15
ÏĤ          -0.15
rette       -0.15
Positive Logits
lessly      0.22
edly        0.21
books       0.19
exion       0.18
yssey       0.17
ãĥ¥         0.16
getManager  0.16
book        0.16
ìĤ¬íķŃ      0.15
ìĤ¬íķŃ      0.15
Activations Density 0.036%