Explanations
phrases that are likely identifiers of specific individuals
instances of a specific character or symbol
Negative Logits
Token                 Logit
multiplying           -0.71
guiActiveUnfocused    -0.68
sweeping              -0.68
demoral               -0.68
scattering            -0.67
versa                 -0.66
shack                 -0.65
scaven                -0.64
mid                   -0.64
Veg                   -0.63
Positive Logits
Token                 Logit
£                     1.08
ı                     1.05
¬                     0.94
İ                     0.92
§                     0.92
Ĵ                     0.91
¹                     0.91
į                     0.87
Ń                     0.87
¡                     0.87
Activations Density: 0.126%