INDEX
Explanations
instances of the word "Caution" and variations of the prefix "Ca"
Negative Logits
Token        Logit
го           -0.17
linger       -0.16
enheim       -0.16
anken        -0.15
@Bean        -0.14
.Atomic      -0.14
orthand      -0.14
voke         -0.14
mastering    -0.14
anks         -0.13
Positive Logits
Token        Logit
/ca          0.21
iflower      0.21
UTION        0.20
ution        0.18
(ca          0.18
ifornia      0.17
ibbean       0.17
esar         0.16
ca           0.15
iforn        0.15
Activations Density: 0.023%