INDEX
Explanations
phrases starting with symbols that are typically not found in standard text
expressions related to financial support or expenditures
Negative Logits
Seym       -0.77
Koch       -0.76
Belg       -0.76
shroud     -0.75
Opera      -0.72
Tasman     -0.72
accomp     -0.68
nodd       -0.67
Ples       -0.67
Railway    -0.65
Positive Logits
their         1.01
there         0.98
someone       0.95
¡             0.92
ĸ             0.88
everything    0.88
his           0.88
him           0.88
your          0.84
perfect       0.84
Activations Density: 0.240%