INDEX
Explanations
proper nouns, such as names of people and places
instances of the token "Te."
Negative Logits
containment   -0.70
etheless      -0.69
ĸļ            -0.68
footing       -0.63
iewicz        -0.62
Jenner        -0.61
ESA           -0.61
Throne        -0.61
lessly        -0.61
totality      -0.61
Positive Logits
achable       1.36
levision      1.32
aching        1.31
achers        1.27
acher         1.24
lev           1.20
brate         1.12
legram        1.09
legraph       1.07
ppo           1.05
Activations Density: 0.018%