INDEX
Explanations
references to "Bob" in various contexts
Negative Logits
eren      -0.17
iffe      -0.15
lý        -0.14
ifikace   -0.14
ene       -0.14
erin      -0.14
izers     -0.14
Ïīν       -0.14
ebek      -0.14
ants      -0.14
Positive Logits
bie       0.28
bi        0.23
Dylan     0.22
bing      0.21
би        0.21
cat       0.21
ble       0.20
cats      0.19
bies      0.18
erta      0.17
Activations Density: 0.006%