INDEX
Explanations
terms or phrases used to describe or categorize concepts
specific terminology and technical jargon
Negative Logits
tsky          -0.69
thumbnails    -0.65
oppable       -0.63
phis          -0.61
aukee         -0.61
uala          -0.61
recip         -0.60
unsus         -0.59
igators       -0.58
illance       -0.58
Positive Logits
coined        1.24
referring     1.08
synonymous    1.04
ciation       1.00
meaning       0.99
shorthand     0.94
describing    0.93
loosely       0.91
refers        0.86
used          0.86
Activations Density 0.104%