INDEX
Explanations
Roman numerals
instances of the Roman numeral "ii" and other similar sequences
Negative Logits
lain         -0.84
marsh        -0.70
convict      -0.68
derailed     -0.65
boards       -0.65
arresting    -0.64
marked       -0.63
ainted       -0.62
rooms        -0.61
hillary      -0.61
Positive Logits
ii           1.48
iii          1.40
FY           1.01
II           0.97
aeda         0.95
ei           0.92
ordan        0.85
ulia         0.84
ii           0.84
III          0.84
Activations Density: 0.003%