INDEX
Explanations
instances of unusual or out-of-context words or phrases
significant events or actions in a narrative context
Negative Logits
Ħ¢          -0.75
uster       -0.59
manif       -0.57
gaining     -0.56
outp        -0.54
quadru      -0.54
extrater    -0.54
').         -0.54
ŃĶ          -0.53
risen       -0.53
Positive Logits
↵               1.16
<|endoftext|>   0.95
↵↵              0.84
SPONSORED       0.82
CrossRef        0.76
argo            0.69
alia            0.62
PNG             0.62
âĺ              0.61
âĶľâĶĢâĶĢ       0.60
Activations Density: 0.251%