Explanations
terms related to specific categories or items
terms related to various types of objects, actions, and concepts
nouns after newlines, points or a comma
Negative Logits
Token      Logit
Honest     -0.74
Witness    -0.70
stown      -0.68
Finish     -0.66
Hills      -0.66
BUG        -0.66
Fla        -0.65
FTWARE     -0.64
Brother    -0.64
Hig        -0.63
Positive Logits
Token      Logit
are        1.23
aren       1.20
cannot     1.08
exist      1.04
were       1.02
differ     1.02
prolifer   1.01
contain    1.00
existed    0.99
weren      0.97
Activations Density 0.392%