INDEX
Explanations
phrases related to explaining or understanding a concept
instances that indicate understanding or knowledge about a topic
Negative Logits
Token           Logit
wives           -0.65
Sov             -0.62
unfocusedRange  -0.61
approvals       -0.58
Virgin          -0.57
registry        -0.57
tolerated       -0.56
Votes           -0.56
canceled        -0.55
subcontract     -0.55
Positive Logits
Token           Logit
briefly         0.90
empir           0.84
analogy         0.83
examine         0.79
alogy           0.78
uncture         0.76
illuminating    0.76
illuminate      0.75
analyze         0.75
analy           0.74
Activations Density 0.573%