Explanations
dates and events mentioned in various contexts
references to specific films, events, or cultural products
Negative Logits
Token       Logit
Nope        -0.79
omething    -0.72
Unknown     -0.68
crappy      -0.66
whatever    -0.61
Rather      -0.61
Instead     -0.60
existent    -0.60
something   -0.59
Sure        -0.58
Positive Logits
Token         Logit
additionally  1.06
also          1.03
airs          1.00
furthermore   0.83
previously    0.80
further       0.76
Also          0.75
totaled       0.74
also          0.73
originally    0.72
Activations Density: 0.586%
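
The page does not define how the density figure is computed. A minimal sketch, assuming it is the percentage of sampled token positions on which the feature's activation exceeds zero; the function name, the threshold parameter, and the toy data are all hypothetical and are not taken from the source:

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of entries strictly above `threshold`.

    Assumption: "density" means the share of token positions where the
    feature fires at all; the source page does not state its definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical toy data: 1000 token positions, 6 of which activate.
toy = [0.0] * 994 + [0.3, 1.2, 0.8, 2.1, 0.05, 0.9]
density = activation_density(toy)
print(f"{density:.3f}%")
```

Under this reading, a density of 0.586% would mean the feature fires on roughly 6 out of every 1000 sampled tokens.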