INDEX
Explanations
numbers and codes related to organizations or locations
Negative Logits
Token   Logit
549     -0.20
565     -0.19
ĸ       -0.19
548     -0.18
865     -0.18
848     -0.18
Ķ       -0.18
864     -0.18
546     -0.18
677     -0.18
Positive Logits
Token   Logit
301     0.49
501     0.47
601     0.47
402     0.47
401     0.47
701     0.47
302     0.47
303     0.47
502     0.47
503     0.46
Activations Density: 0.231%