INDEX
Explanations
mentions of the name "Jones."
Negative Logits
Token               Logit
httphttps           -0.48
']):                -0.45
:'/                 -0.44
ErrUnexpectedEOF    -0.44
Gav                 -0.43
льше                -0.43
Gav                 -0.43
Tig                 -0.42
***!                -0.42
PHIL                -0.42
Positive Logits
Token               Logit
Jones               2.36
Jones               2.22
JONES               2.19
jones               1.95
jones               1.81
琼                  0.90
Джон                0.86
瓊                  0.84
ONES                0.81
ーンズ              0.79
Activations Density: 0.003%
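
To put the density figure in perspective, here is a minimal sketch of how such a number could be computed. It assumes density means the fraction of tokens on which the feature's activation is above zero; the page does not state its definition, so that reading, the function name `activation_density`, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of tokens with a nonzero
    activation. The source page does not define the term.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 3 of 100,000 tokens.
sample = [0.0] * 99_997 + [1.2, 0.7, 3.4]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.003% corresponds to roughly 3 active tokens per 100,000, which fits a feature tied to a single surname.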