NEGATIVE LOGITS
م          0.72
spear      0.64
Data       0.63
Query      0.62
Country    0.62
large      0.61
ین         0.60
data       0.59
lock       0.59
Row        0.59
POSITIVE LOGITS
cigarettes  1.09
tobacco     1.01
cigarette   0.96
Tobacco     0.93
🚬          0.93
smoking     0.92
🚭          0.91
nicotine    0.90
Tobacco     0.89
cigarettes  0.88
Activations Density 0.026%