INDEX
Explanations
references to kitchen appliances and their associated issues
Negative Logits
Token     Logit
sing      -0.17
lez       -0.16
ãĥ£       -0.15
adier     -0.15
gings     -0.15
á»§y      -0.15
oc        -0.15
igo       -0.14
plevel    -0.14
gis       -0.14
Positive Logits
Token     Logit
ette      0.24
ettes     0.21
ware      0.20
utens     0.18
wares     0.18
/lab      0.17
æª        0.17
maid      0.16
aid       0.16
laden     0.16
Activations Density 0.023%