INDEX
Explanations
sentence structures indicating the presence of information or knowledge
references to changes and what is known or requires action
Negative Logits
gart        -0.77
asers       -0.69
aser        -0.66
IDs         -0.65
chairs      -0.60
heels       -0.59
Outbreak    -0.58
upiter      -0.57
nels        -0.57
motion      -0.57
Positive Logits
done            0.99
accomplished    0.94
learnt          0.88
transpired      0.87
learned         0.83
glean           0.83
written         0.82
redacted        0.80
FINE            0.78
undone          0.76
Activations Density: 0.149%