INDEX
Explanations
sentences containing the word "a" followed by an adjective
phrases containing the article "a" followed by descriptors or nouns, suggesting evaluation or categorization
Negative Logits
ographs      -0.79
units        -0.76
ansson       -0.75
effects      -0.75
aday         -0.72
itely        -0.70
Edit         -0.70
Element      -0.69
arcs         -0.67
iffs         -0.67
Positive Logits
mistake      1.31
violation    1.19
hoax         1.14
ploy         1.11
nuisance     1.10
distraction  1.09
joke         1.09
betrayal     1.08
waste        1.06
failure      1.06
Activations Density 0.219%