INDEX
Explanations
references to a specific place starting with the letters "Ur"
references to a specific location or entity related to Urbana
Negative Logits
Token       Logit
Sins        -0.76
iculty      -0.75
washer      -0.75
女          -0.75
Dickinson   -0.68
sylvania    -0.66
Misc        -0.66
ãģ¦         -0.66
æ³          -0.65
Scarlet     -0.64
Positive Logits
Token       Logit
gent        1.13
gencies     1.00
du          0.92
gery        0.91
pose        0.90
ulence      0.86
pee         0.86
gur         0.86
vey         0.84
seless      0.84
Activations Density: 0.022%