INDEX
Explanations
Roman numerals representing historical figures or events
Roman numerals associated with historical figures or events
Negative Logits
Token    Logit
bies     -0.88
boards   -0.87
feed     -0.81
href     -0.80
sight    -0.78
bery     -0.75
lins     -0.73
メ       -0.71
cho      -0.70
ケ       -0.69
Positive Logits
Token     Logit
III       1.21
II        0.97
Jinping   0.88
III       0.79
VIII      0.77
oodoo     0.74
arios     0.74
Aus       0.71
Centauri  0.70
VII       0.69
Activations Density: 0.017%