INDEX
Explanations
mentions of "Dur" related to places or names
Negative Logits
aking      -0.18
AS         -0.15
orp        -0.14
513        -0.14
urtles     -0.14
Focus      -0.14
Winston    -0.14
amos       -0.13
errat      -0.13
ae         -0.13
Positive Logits
amework      0.17
inja         0.17
umo          0.15
inati        0.15
(_:          0.15
cumshot      0.15
/copyleft    0.14
ÑĢиÑı        0.14
lox          0.14
isma         0.14
Activations Density 0.003%