Explanations
instances of the word "illusion" and its variations
Negative Logits
sheet     -0.15
ñana      -0.15
emat      -0.14
Xd        -0.14
yectos    -0.14
isz       -0.14
Baths     -0.14
hatt      -0.14
elor      -0.13
ắp        -0.13
Positive Logits
rious     0.17
/mock     0.16
sworth    0.15
aye       0.14
etti      0.14
Chance    0.14
ota       0.14
nal       0.14
922       0.14
etten     0.14
Activations Density: 0.009%