INDEX
Explanations
highly emphasized and distinct terms within a text
instances of the term "int" or related variations indicating intensity, interaction, or introspection
Negative Logits
Token         Logit
Accessory     -0.68
hyde          -0.67
wagen         -0.62
bial          -0.59
awaru         -0.59
Mbps          -0.58
condos        -0.57
Fram          -0.57
ggles         -0.56
innovation    -0.55
Positive Logits
Token         Logit
itial         1.10
ensions       0.95
ract          0.87
itially       0.86
ruct          0.86
ension        0.86
repre         0.84
ents          0.82
rep           0.80
ently         0.79
Activations Density: 0.016%