INDEX
Explanations
phrases related to illicit activities or wrongdoing
terms related to Silicon Valley and its associated culture
Negative Logits
æ©                  -0.88
RTX                 -0.84
ãĥ³ãĤ¸              -0.83
guiActiveUnfocused  -0.78
ãĥ¼ãĥĨãĤ£           -0.76
oise                -0.75
ãĥ£                 -0.74
ãĥ¼ãĥĨ              -0.71
ilage               -0.68
ãĥĥãĥĪ              -0.68
Positive Logits
³³³³³³³³³³³³³³³³    0.87
terday              0.84
Lens                0.81
Shift               0.81
³³³³³³³³            0.75
gged                0.74
Film                0.71
berman              0.71
Else                0.71
ggle                0.70
Activations Density: 0.030%