Explanations
mentions of significant numerical figures or statistics
Negative Logits
jri         -0.79
beh         -0.76
milo        -0.73
ional       -0.70
urally      -0.70
bably       -0.70
seiz        -0.69
elig        -0.69
wagen       -0.68
smanship    -0.68
Positive Logits
Expand      1.25
Close       0.82
CLOSE       0.80
Unlock      0.80
Unable      0.77
Copyright   0.75
License     0.75
WATCH       0.71
Govern      0.71
Submit      0.71
Activations Density: 0.002%