Explanations
References to the concept of "whole" or "entirety."
Negative Logits
Token         Logit
nakalista     -0.78
brücken       -0.68
estimat       -0.67
bacilli       -0.66
Brutus        -0.64
dci           -0.63
Platon        -0.63
pathologist   -0.63
apunov        -0.62
rrggbb        -0.60
Positive Logits
Token         Logit
entire        1.21
whole         1.15
whole         1.03
Whole         1.00
Whole         0.97
WHOLE         0.96
thing         0.94
ENTIRE        0.88
entire        0.88
Entire        0.88
Activations Density: 0.061%
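
The page does not define the density figure. A minimal sketch, assuming it is the fraction of sampled token positions on which the feature's activation is nonzero; the function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and used only for illustration:

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 1 of 10,000 token positions.
sample = [0.0] * 9999 + [3.2]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.061% would mean the feature fires on roughly 6 of every 10,000 sampled tokens.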