Explanations
patterns or repetitions in data
references to recurring themes or trends
Negative Logits
Token       Logit
grim        -0.78
igious      -0.73
ilings      -0.71
zona        -0.68
minent      -0.68
=-=-=-=-    -0.67
glomer      -0.66
UST         -0.66
igion       -0.65
ocado       -0.65
Positive Logits
Token       Logit
pattern     1.21
patterns    1.12
Pattern     1.10
Pattern     1.03
pattern     1.00
Patterns    1.00
eering      0.98
ĸļ          0.97
repeats     0.81
atile       0.75
Activations Density: 0.007%