INDEX
Explanations
Roman numerals associated with particular phrases or words
References to structured categories or classifications, especially in relation to entertainment, volumes, and historical figures
Negative Logits
uckland      -0.70
heed         -0.60
TAM          -0.59
mouth        -0.59
dent         -0.58
tem          -0.58
panic        -0.56
Teg          -0.55
chance       -0.55
expensive    -0.55
Positive Logits
III          1.98
VII          1.86
II           1.85
VIII         1.85
IV           1.83
III          1.82
VI           1.78
II           1.73
XII          1.69
VII          1.66
Activations Density: 0.310%