Explanations
instances where specific characters or symbols appear in succession
punctuation marks indicating the end of statements or phrases
Negative Logits
neighb      -0.77
kettle      -0.71
cabbage     -0.69
feared      -0.67
resisted    -0.66
clock       -0.66
brut        -0.64
board       -0.64
scen        -0.64
ellery      -0.63
Positive Logits
Retrieved   0.94
Accessed    0.88
aspx        0.84
illin       0.79
mobi        0.77
wcsstore    0.76
wav         0.74
jpg         0.73
zip         0.73
IGHTS       0.72
Activations Density: 0.063%