INDEX
NEGATIVE LOGITS
�            -0.07
�            -0.07
Freund       -0.07
�            -0.07
Tamb         -0.07
Carlson      -0.07
bundan       -0.07
Wednesday    -0.07
Carn         -0.07
dancer       -0.07
POSITIVE LOGITS
IP           0.14
IP           0.13
ip           0.12
Ip           0.12
_ip          0.11
ip           0.11
Ip           0.10
I            0.09
(ip          0.09
IPv          0.09
Activations Density 0.010%