INDEX
Explanations
dates, quantities, and estimations
words related to reporting, estimating, and revealing information or claims
Negative Logits
ROM      -0.78
picture  -0.74
avorite  -0.74
isode    -0.73
uga      -0.72
apes     -0.67
otype    -0.66
pak      -0.65
orio     -0.65
Exit     -0.65
Positive Logits
arg          0.69
Eth          0.66
unanimously  0.65
electr       0.64
Engels       0.63
¶            0.63
contention   0.62
ICO          0.61
contended    0.60
orally       0.59
Activations Density 0.096%