INDEX
Explanations
identifiers related to items, possibly focusing on their codes or numerical values
NEGATIVE LOGITS
å¹        -0.19
_phys     -0.17
Bundy     -0.17
897       -0.16
846       -0.16
WARE      -0.16
872       -0.15
į         -0.15
894       -0.15
Savage    -0.15
POSITIVE LOGITS
78        0.41
79        0.40
77        0.39
776       0.36
780       0.35
781       0.35
779       0.35
76        0.34
778       0.32
782       0.31
Activations Density 0.050%