INDEX
Explanations
mentions of "downtown."
Negative Logits
itſelf      -0.92
myſelf      -0.87
Theſe       -0.83
Jefus       -0.80
Monfieur    -0.79
faſt        -0.78
himſelf     -0.76
purpoſe     -0.75
pleaſure    -0.74
neceff      -0.74
Positive Logits
api         0.77
library     0.67
API         0.63
fa          0.63
Api         0.60
Library     0.60
قایناقلار   0.59
API         0.55
Api         0.54
about       0.54
Activations Density: 0.099%