INDEX
Explanations
specific town names
references to societal or political issues and governance
Negative Logits
@           -0.53
thanking    -0.49
Originally  -0.48
consists    -0.45
because     -0.44
[+          -0.44
spoilers    -0.43
ðŁij        -0.42
tonight     -0.40
FANTASY     -0.40
Positive Logits
)).         0.92
.).         0.85
]).         0.77
]."         0.75
%).         0.75
).[         0.72
?).         0.71
.''.        0.69
}.          0.69
).          0.67
Activations Density 3.006%