Explanations
words with the prefix "pl" or words that contain "pl"
Negative Logits
aml      -0.16
conj     -0.15
invite   -0.15
ABI      -0.14
icted    -0.14
jekt     -0.14
sei      -0.14
kul      -0.13
apiro    -0.13
udas     -0.13
Positive Logits
anned    0.25
ugs      0.24
ural     0.24
enty     0.23
ough     0.23
anning   0.23
ucky     0.22
ugging   0.22
ugged    0.22
umb      0.22
Activations Density 0.011%