Explanations
references to the "Fourth" or court-related terms
Negative Logits
Token    Logit
3        -0.80
2        -0.78
8        -0.72
1        -0.69
6        -0.68
T        -0.65
5        -0.65
4        -0.64
I        -0.62
B        -0.60
Positive Logits
Token    Logit
fourth   2.04
fourth   1.96
Fourth   1.95
Fourth   1.89
seventh  1.72
Fifth    1.72
sixth    1.71
Sixth    1.68
fifth    1.67
fifth    1.65
Activations Density 0.113%