Explanations
references to "Kent" or "Kentucky."
Negative Logits
arness    -0.17
geh       -0.17
aturas    -0.16
aday      -0.15
olik      -0.15
rau       -0.15
cona      -0.15
amilia    -0.14
ce        -0.14
sent      -0.14
Positive Logits
ucky      0.39
mere      0.24
aro       0.22
UCK       0.22
uck       0.21
uky       0.21
ucker     0.19
ekli      0.18
eken      0.17
lage      0.17
Activations Density 0.005%