INDEX
Explanations
Roman numerals followed by a number
references to "IX" and related context within texts
Negative Logits
Token        Logit
collection   -0.75
isman        -0.72
kson         -0.67
Haku         -0.66
rill         -0.61
offer        -0.61
actor        -0.61
aff          -0.60
mot          -0.60
ischer       -0.59
Positive Logits
Token        Logit
IX           1.33
TERN         0.94
CEPT         0.84
DAQ          0.83
CLUS         0.82
daq          0.79
atile        0.79
BSD          0.79
edIn         0.79
alde         0.78
Activations Density: 0.005%