Explanations
mentions of the word "wired"
words related to fatigue or exhaustion
Negative Logits
ç¥ŀ      -0.72
Bei      -0.67
oulos    -0.66
amen     -0.65
MER      -0.61
Bundy    -0.61
Barcl    -0.61
romeda   -0.60
Ń·       -0.59
alter    -0.59
Positive Logits
lessly   0.94
rums     0.85
ilage    0.83
anto     0.82
ansas    0.80
rum      0.80
itably   0.80
rack     0.79
ndra     0.77
dit      0.76
Activations Density 0.020%