Explanations
dates in the format of month followed by day
specific references to a character or significant element in the context of a series
Negative Logits
Attribution   -0.80
Kimber        -0.68
olver         -0.67
anni          -0.62
Trigger       -0.61
erers         -0.60
Actor         -0.59
replicate     -0.59
CW            -0.58
Patterns      -0.58
Positive Logits
teenth        0.81
otom          0.78
peril         0.77
foundland     0.73
otomy         0.68
anamo         0.68
Gleaming      0.67
Downloadha    0.65
uten          0.65
izen          0.64
Activations Density 0.000%