INDEX
Explanations
references to the name "Peter."
Negative Logits
spe              -0.18
Evet             -0.16
nave             -0.16
pled             -0.15
anager           -0.15
/stdc            -0.15
InstanceState    -0.15
.plugins         -0.15
arna             -0.15
sit              -0.14
Positive Logits
borough          0.33
bilt             0.28
pan              0.20
_pan             0.19
loo              0.19
Damian           0.19
boro             0.19
Pan              0.18
Rabbit           0.18
kin              0.18
Activations Density 0.013%