INDEX
Explanations
words related to transportation and locations
words related to trafficking
Negative Logits
Tsukuyomi      -0.75
ĨĴ             -0.72
uncont         -0.71
empowerment    -0.69
autobi         -0.68
superhuman     -0.66
awakening      -0.65
monop          -0.65
leisure        -0.65
denial         -0.64
Positive Logits
ï¸ı            0.98
andise         0.93
inary          0.88
uld            0.86
ousy           0.86
heed           0.84
nels           0.83
ulent          0.81
incial         0.80
oppers         0.78
Activations Density: 0.096%