Explanations
words related to criminal activities, specifically thieves and robbery
terms related to thieves and robbery
Negative Logits
Token      Logit
zl         -0.79
akeru      -0.74
mberg      -0.71
inct       -0.69
enegger    -0.67
CBD        -0.66
eda        -0.65
)=(        -0.65
verend     -0.65
emb        -0.65
Positive Logits
Token      Logit
thieves    1.54
thief      1.35
Thieves    1.18
robbers    1.01
guild      0.96
burg       0.90
robber     0.87
Thief      0.87
burgl      0.85
undermin   0.85
Activations Density: 0.010%