INDEX
Explanations
mentions or references to something specific in a text
instances of the word "mentioned" and similar referencing phrases indicating prior discussions or points made
Negative Logits
Token        Logit
ococ         -0.72
bers         -0.70
mens         -0.68
generated    -0.67
quartered    -0.62
mediated     -0.62
uga          -0.62
asm          -0.61
earances     -0.61
rubble       -0.61
Positive Logits
Token        Logit
earlier      1.72
above        1.49
previously   1.31
yesterday    1.19
before       1.17
previous     1.10
last         1.06
preceding    1.03
above        0.99
before       0.98
Activations Density 0.121%