Explanations
variations of the word "inn" and related concepts
Negative Logits
oftware       -0.18
chter         -0.15
gli           -0.15
instrument    -0.15
arkin         -0.15
OrNil         -0.14
/respond      -0.14
pher          -0.14
inaire        -0.14
pler          -0.14
Positive Logits
keeper        0.27
keepers       0.27
ards          0.26
ately         0.23
smouth        0.21
erview        0.21
ervation      0.21
odb           0.20
KEEP          0.20
nn            0.19
Activations Density: 0.006%