Explanations
words and phrases indicative of academic institutions and their designations
Negative Logits
Token       Logit
undi        -0.14
_footer     -0.14
Discovery   -0.14
ám          -0.14
akis        -0.14
ãĥ¼ãĥł      -0.14
688         -0.14
iegel       -0.14
io          -0.13
olume       -0.13
Positive Logits
Token       Logit
ALLY        0.16
Sinclair    0.15
usters      0.15
entifier    0.14
ally        0.14
CFG         0.14
ycz         0.14
_QMARK      0.14
erç         0.14
upo         0.14
Activations Density: 0.007%
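
The page does not say how the logit lists or the density figure are produced. The sketch below shows one plausible reading, under stated assumptions: that "Activations Density" is the percentage of sampled token positions on which the feature's activation is above zero, and that the logit lists are the ten lowest and ten highest per-token logit effects. The function names `activation_density` and `top_logits` are hypothetical and are not taken from any dashboard or library.

```python
from typing import Dict, List, Sequence, Tuple


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Percentage of token positions whose activation exceeds `threshold`.

    Assumption: this is what the dashboard's density figure measures.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


def top_logits(
    logit_effects: Dict[str, float], k: int = 10
) -> Tuple[List[Tuple[str, float]], List[Tuple[str, float]]]:
    """Return (most negative k, most positive k) token/logit pairs.

    Assumption: `logit_effects` maps each vocabulary token to the
    feature's effect on that token's output logit.
    """
    ranked = sorted(logit_effects.items(), key=lambda kv: kv[1])
    negative = ranked[:k]          # lowest values first
    positive = ranked[::-1][:k]    # highest values first
    return negative, positive


# Usage with four of the values listed on this page.
effects = {"ALLY": 0.16, "Sinclair": 0.15, "undi": -0.14, "io": -0.13}
negative, positive = top_logits(effects, k=2)

# A feature active on 7 of 100,000 sampled positions (illustrative counts only).
density = activation_density([0.0] * 99_993 + [1.2] * 7)
```

Under these assumptions, a density of 0.007% corresponds to roughly 7 active positions per 100,000 sampled tokens, which means the feature fires very rarely.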