INDEX
Explanations
instances where the word "largely" occurs
phrases indicating prevailing trends or characteristics
Negative Logits
anth          -0.75
Chaser        -0.72
abad          -0.69
atures        -0.68
yle           -0.68
Kard          -0.66
endi          -0.66
Ambassador    -0.66
Probe         -0.65
yers          -0.65
Positive Logits
unchanged     0.94
reliant       0.91
overlooked    0.90
unaffected    0.89
consist       0.86
comprised     0.86
consisted     0.85
absent        0.84
lacking       0.83
relying       0.83
Activations Density: 0.022%