INDEX
Explanations
references to the name "Peter."
Negative Logits
aries     -0.17
ez        -0.15
tul       -0.15
aghan     -0.15
eer       -0.15
trees     -0.15
arna      -0.15
yx        -0.14
si        -0.14
ergisi    -0.14
Positive Logits
borough    0.45
bilt       0.34
loo        0.29
hof        0.26
Rabbit     0.26
boro       0.25
pan        0.24
kin        0.24
mann       0.24
Pan        0.24
Activations Density: 0.010%