Explanations
instances of the word "knock."
Negative Logits
Token         Logit
internetowa   -0.44
nyc           -0.43
녔            -0.42
pangan        -0.40
caid          -0.40
kind          -0.40
genieten      -0.39
rierten       -0.39
ayanan        -0.37
Hire          -0.37
Positive Logits
Token         Logit
knock         0.91
knocking      0.88
Knock         0.84
knocks        0.82
knock         0.81
KN            0.74
Knock         0.71
knocked       0.69
hammer        0.68
Kno           0.66
Activations Density: 1.751%