Explanations
mentions of specific locations or names
words containing the letters "nt" and "iant" in various contexts
Negative Logits
aciously    -0.67
hammer      -0.66
cano        -0.65
perty       -0.63
bench       -0.59
bait        -0.58
downward    -0.57
challeng    -0.55
mun         -0.55
STON        -0.55
Positive Logits
e           1.11
ech         1.10
ean         1.10
yre         1.04
rek         1.02
oday        1.02
rix         1.00
own         0.97
ez          0.96
heastern    0.93
Activations Density: 0.083%