INDEX
Explanations
definitions or descriptions within a text
instances of the term "definitions" and related terminology
Negative Logits
ajo                -0.72
hop                -0.71
hops               -0.67
pal                -0.64
sold               -0.63
wash               -0.63
(token not shown)  -0.62
Seat               -0.62
nearby             -0.62
oyal               -0.62
Positive Logits
definitions   3.70
definition    2.65
Definitions   2.33
Definition    2.10
Definition    1.82
definition    1.77
meanings      1.75
descriptions  1.56
initions      1.55
defines       1.53
Activations Density 0.020%
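
The density figure above can be read as the share of tokens on which the feature fires at all. The sketch below illustrates that reading under stated assumptions: the page does not say how the number was computed, so the function name `activation_density`, the zero threshold, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means (active tokens) / (all tokens sampled).
    The source page does not confirm this definition.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 2 of 10,000 tokens.
sample = [0.0] * 9998 + [1.3, 0.4]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.020%
```

Under this reading, a density of 0.020% corresponds to roughly 2 active tokens per 10,000 sampled.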