INDEX
Explanations
terms related to consumption and consumerism
Negative Logits
er            -0.20
ifications    -0.19
ifik          -0.17
دار           -0.17
aments        -0.16
zes           -0.16
chet          -0.15
canf          -0.15
irected       -0.15
ios           -0.15
Positive Logits
ptive         0.40
ption         0.38
ptions        0.35
PTION         0.35
mate          0.34
ables         0.27
pt            0.27
mates         0.24
pta           0.22
pton          0.21
Activations Density 0.006%