INDEX
Explanations
references to specific brands or products in the text
Negative Logits
illac        -0.18
forks        -0.17
fried        -0.17
fullscreen   -0.16
sdale        -0.16
phan         -0.16
fence        -0.15
duk          -0.15
<fieldset    -0.15
loth         -0.14
Positive Logits
(F           0.23
,F           0.21
FF           0.20
:F           0.20
FT           0.19
FR           0.18
F            0.18
/F           0.18
FG           0.18
(ft          0.18
Activations Density: 0.217%