Explanations
references to a specific brand or product related to hair
Negative Logits
raiding     -0.69
retiring    -0.69
emale       -0.67
wearable    -0.67
oston       -0.66
worker      -0.66
robbing     -0.62
rade        -0.61
stocking    -0.61
rugged      -0.60
Positive Logits
EngineDebug          0.84
hz                   0.73
ña                   0.72
atal                 0.68
vari                 0.68
natureconservancy    0.67
ko                   0.67
e                    0.67
hani                 0.66
eon                  0.65
Activations Density: 0.021%