INDEX
Explanations
phrases related to undeniable proof or evidence
Negative Logits
Hunting    -0.72
funer      -0.67
Housing    -0.62
Solitaire  -0.62
quarters   -0.61
Franch     -0.60
redes      -0.60
dust       -0.60
nets       -0.60
straw      -0.60
Positive Logits
putable    1.20
akable     1.11
ably       1.05
cipline    0.99
arkable    0.98
uously     0.98
cipl       0.98
ensible    0.96
puted      0.96
uably      0.94
Activations Density: 0.027%