Explanations
occurrences of the name "Jack."
Negative Logits
gebung    -0.16
onga      -0.16
izar      -0.15
Invent    -0.15
loor      -0.15
ulo       -0.15
ãĤ²       -0.14
νομα      -0.14
akk       -0.14
бой       -0.14
Positive Logits
rabbit    0.33
ass       0.29
pot       0.28
knife     0.28
fruit     0.27
ie        0.26
rab       0.26
asses     0.25
hammer    0.25
aroo      0.24
Activations Density: 0.010%