Explanations
mentions of things being the same or identical
references to the concept of matching
Negative Logits
Ïī                                -0.70
OLOG                              -0.70
********************************  -0.69
ase                               -0.67
Daily                             -0.66
################################  -0.65
ember                             -0.65
aug                               -0.65
gom                               -0.64
bra                               -0.64
Positive Logits
matching  1.10
matched   1.05
matches   0.95
ees       0.81
match     0.80
inances   0.75
mism      0.74
paren     0.73
poons     0.71
pairs     0.71
Activations Density 0.010%