INDEX
Explanations
phrases containing the word "ble" with high activations
words or phrases related to gambling
Negative Logits
Token    Logit
ohn      -0.78
arro     -0.77
alli     -0.75
ellar    -0.71
erva     -0.68
799      -0.68
rior     -0.67
ivals    -0.67
orters   -0.67
ahime    -0.66
Positive Logits
Token    Logit
bles     1.16
theless  1.10
bling    0.94
ble      0.92
grass    0.90
bled     0.87
bly      0.86
vous     0.83
tt       0.83
ton      0.81
Activations Density 0.029%
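
The page does not say how the density figure is computed. A common reading is the share of sampled token positions on which the feature fires at all. The sketch below assumes that reading, with "activation greater than zero" as the firing criterion; the function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: a feature counts as "active" on a token when its activation
    is strictly greater than `threshold` (0.0 by default).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy sample: 3 active positions out of 10,000 tokens.
sample = [0.0] * 9997 + [1.2, 0.4, 3.1]
density = activation_density(sample)

# Format as a percentage, the way the dashboard reports it.
print(f"Activations Density {density:.3%}")
```

Under this assumption, a density of 0.029% would mean the feature fires on roughly 29 of every 100,000 sampled tokens, which marks it as a sparse feature.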