INDEX
Explanations
mentions of the term "Ayatollah."
Negative Logits
adden         -0.19
nels          -0.19
createClass   -0.17
viÄį          -0.17
Starr         -0.16
zd            -0.16
ective        -0.15
cz            -0.15
fic           -0.15
mutation      -0.15
Positive Logits
urved         0.23
AILABLE       0.17
akens         0.16
erdem         0.15
onus          0.14
Quadr         0.14
aley          0.14
ilon          0.14
sembly        0.14
Ãłng          0.14
Activations Density 0.017%