INDEX
Explanations
differentiating characteristics or qualities within a given context
references to varying categories and types
Negative Logits
isman            -0.72
arton            -0.67
ItemTracker      -0.64
earnest          -0.58
irony            -0.58
gotten           -0.57
ridor            -0.57
understatement   -0.56
excess           -0.56
patience         -0.55
Positive Logits
depending        1.45
paces            1.14
differing        1.14
Different        1.07
varying          0.98
Different        0.96
vying            0.95
styles           0.95
depending        0.95
different        0.94
Activations Density 0.247%