INDEX
Explanations
names or terms related to specific people or locations
proper nouns related to names and places
Negative Logits
cerning      -0.60
ocating      -0.57
ulative      -0.56
ancies       -0.56
lycer        -0.55
ifty         -0.54
atures       -0.54
anchester    -0.53
arine        -0.53
rius         -0.52
Positive Logits
shit         0.56
bash         0.54
vals         0.52
Lines        0.52
¥µ           0.51
zon          0.50
icho         0.50
Redditor     0.50
scl          0.50
strap        0.50
Activations Density 0.990%