INDEX
Explanations
words related to legal and political matters
Negative Logits
Token        Logit
anwhile      -0.73
Manhattan    -0.70
scattering   -0.69
Send         -0.67
scatter      -0.67
exhib        -0.65
collect      -0.64
Yon          -0.63
Whit         -0.62
shack        -0.60
Positive Logits
Token        Logit
¬            1.33
ľ            1.17
º            1.07
ı            1.04
Ĵ            1.03
Ķ            1.03
¡            1.01
¼            1.01
ĸ            0.99
ĭ            0.99
Activations Density: 1.600%