INDEX
Explanations
instances of the word "shut" and its variations
Negative Logits
idis      -0.17
/=        -0.15
ont       -0.15
aggio     -0.15
haled     -0.15
illet     -0.15
headline  -0.14
207       -0.14
y         -0.14
iac       -0.14
Positive Logits
ters      0.43
ting      0.31
tings     0.29
TING      0.23
down      0.22
tdown     0.22
ty        0.22
TER       0.21
tl        0.21
doors     0.21
Activations Density 0.008%