INDEX
Explanations
references to admiration or worship, particularly of individuals
references to "idol" and its variants in various contexts
Negative Logits
Token             Logit
20439             -0.79
RAW               -0.70
~~~~~~~~~~~~~~~~  -0.67
llular            -0.66
Dull              -0.65
raq               -0.65
uckles            -0.65
arnaev            -0.65
esome             -0.64
TPS               -0.62
POSITIVE LOGITS
Token       Logit
idol        1.05
ãħĭ         0.93
idols       0.84
worshipped  0.84
imation     0.83
worsh       0.83
Idol        0.82
ãħĭãħĭ      0.82
imates      0.80
worship     0.77
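
Two of the positive-logit tokens (ãħĭ and ãħĭãħĭ) look like raw vocabulary strings rather than readable text. The sketch below assumes, without confirmation from this page, that the underlying model uses a GPT-2 style byte-level BPE vocabulary, in which every byte is displayed as a stand-in printable character. Under that assumption, mapping the characters back to bytes and decoding as UTF-8 recovers the text the token represents. The helper name `decode_token` is hypothetical.

```python
def bytes_to_unicode():
    """Byte -> display character table in the style of GPT-2 byte-level BPE."""
    # Bytes that are already printable keep their own code point.
    bs = (list(range(0x21, 0x7F))      # '!' .. '~'
          + list(range(0xA1, 0xAD))    # '¡' .. '¬'
          + list(range(0xAE, 0x100)))  # '®' .. 'ÿ'
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to an unused code point at 256 and above.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: display character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Hypothetical helper: turn a vocabulary string back into readable text."""
    raw = bytes(CHAR_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


# The token shown as "ãħĭ", written with escapes to avoid encoding ambiguity.
print(decode_token("\u00e3\u0127\u012d"))
```

If the assumption holds, the token decodes to the Hangul letter ㅋ (U+314B), and the doubled token to ㅋㅋ. Plain ASCII tokens such as idol pass through unchanged.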
Activations Density 0.018%
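
The density figure can be read as the share of tokens on which the feature is active. The sketch below assumes that definition (activation strictly above a threshold of zero), which this page does not state. The function name `activation_density` and the sample data are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold` (assumed definition)."""
    if not activations:
        raise ValueError("need at least one activation value")
    firing = sum(1 for a in activations if a > threshold)
    return firing / len(activations)


# Made-up sample: the feature fires on 18 of 100,000 tokens.
sample = [0.0] * 99_982 + [1.0] * 18
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.018% corresponds to roughly 18 active tokens per 100,000.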