INDEX
Explanations
titles, headings, or labels within a document
Negative Logits
Jagu                 -0.83
467                  -0.82
SolidGoldMagikarp    -0.81
1947                 -0.80
Invaders             -0.79
Kira                 -0.79
ault                 -0.78
Kra                  -0.77
Archer               -0.77
Lich                 -0.75
Positive Logits
10                   1.46
10                   1.29
1070                 1.09
2010                 1.07
1027                 1.02
ten                  0.97
102                  0.96
1050                 0.94
2010                 0.94
1024                 0.93
Activations Density: 0.439%