INDEX
Explanations
profound or impactful statements
frequent occurrences or references to sources
Negative Logits
withd        -0.90
cabbage      -0.82
challeng     -0.82
inver        -0.80
brut         -0.80
attacker     -0.78
fermented    -0.78
flowering    -0.77
crabs        -0.76
mosqu        -0.76
Positive Logits
Accessed     1.26
jpg          1.15
Retrieved    1.15
Org          1.11
org          1.08
txt          1.01
             0.95
com          0.92
Contains     0.92
png          0.92
Activations Density 0.242%