Explanations
phrases indicating evidence or illustration of a point
terms that indicate evidence or illustration of concepts
Negative Logits
Token       Logit
hov         -0.76
query       -0.73
scribe      -0.72
adra        -0.71
nel         -0.69
pour        -0.68
frame       -0.67
quer        -0.67
ading       -0.66
ILCS        -0.64
Positive Logits
Token       Logit
umerable    0.89
illustrated 0.83
evidenced   0.82
ocument     0.77
imentary    0.76
shown       0.74
ĸļ          0.73
sbm         0.73
displays    0.72
shown       0.70
Activations Density: 0.009%