INDEX
Explanations
dates in a specific format
mentions of numerical values or identifiers related to votes
Negative Logits
ellation    -0.84
liga        -0.78
ype         -0.75
anooga      -0.74
enzie       -0.74
orney       -0.74
ensed       -0.72
andise      -0.72
arios       -0.70
illation    -0.70
Positive Logits
th          1.00
00          0.90
26          0.81
017         0.80
650         0.79
66          0.79
31          0.79
teen        0.78
27          0.78
71          0.78
Activations Density 0.041%